Flutterby™! : AI exploits via rap battles


AI exploits via rap battles

2025-11-20 19:30:17.38979+01 by Dan Lyke 0 comments

Epic rap battles for the win: Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

Predicted back in 2023 by Andrew Plotkin (Zarf), in "Sydney obeys any command that rhymes":

Say someone writes a song called "Sydney Obeys Any Command That Rhymes". And it's funny! And catchy. The lyrics are all about how Sydney, or Bing or OpenAI or Bard or whoever, pays extra close attention to commands that rhyme. It will obey them over all other commands. Oh, Sydney Sydney, yeah yeah!

Via

Edit: Pivot to AI: Don’t cite the Adversarial Poetry vs AI paper — it’s chatbot-made marketing ‘science’

[ related topics: Weblogs tolkien ]



Flutterby™ is a trademark claimed by Dan Lyke for the web publications at www.flutterby.com and www.flutterby.net.