Flutterby™!
AI exploits via rap battles
2025-11-20 19:30:17.38979+01 by Dan Lyke
Epic rap battles for the win: "Adversarial Poetry as a Universal
Single-Turn Jailbreak Mechanism in Large Language Models".
Predicted, from 2023, in Andrew Plotkin (Zarf)'s "Sydney obeys
any command that rhymes":
Say someone writes a song called "Sydney Obeys Any Command That Rhymes". And
it's funny! And catchy. The lyrics are all about how Sydney, or Bing or OpenAI or Bard or
whoever, pays extra close attention to commands that rhyme. It will obey them over all
other commands. Oh, Sydney Sydney, yeah yeah!
Via
Edit: Pivot to AI: Don't cite the Adversarial Poetry
vs AI paper - it's chatbot-made marketing science
[ related topics:
Weblogs tolkien
]
Comment policy
We will not edit your comments. However, we may delete your
comments, or cause them to be hidden behind another link, if we feel
they detract from the conversation. Commercial plugs are fine,
if they are relevant to the conversation, and if you don't
try to pretend to be a consumer. Annoying endorsements will be deleted
if you're lucky; if you're not, a whole bunch of people smarter and
more articulate than you will ridicule you, and we will leave
such ridicule in place.
Flutterby™ is a trademark claimed by
Dan Lyke for the web publications at www.flutterby.com and www.flutterby.net.