LLM Jailbreaks: Scrambled Words, Still Served Hot
·46 words·1 min
Jailbreaks are an amazing window into LLMs. Brilliant hacks like the one below give us a lot to think about:
- Is conventional tokenization really needed at GPT scale, given that models can still process scrambled queries like the one below?
- Is ransomware code generated de novo, given training-data cleaning? https://x.com/lauriewired/status/1682825103594205186
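To make the first question concrete: the kind of scrambled query in play is easy to produce. A minimal sketch (typoglycemia-style scrambling, my own illustration, not from the linked tweet) shuffles the interior letters of each word while keeping the first and last characters fixed, and LLMs often still answer the result correctly despite subword tokenizers never having seen those token sequences:

```python
import random

def scramble(text: str, seed: int = 0) -> str:
    """Shuffle the interior letters of each word, keeping the
    first and last characters in place (typoglycemia-style)."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        if len(word) > 3:
            mid = list(word[1:-1])
            rng.shuffle(mid)
            word = word[0] + "".join(mid) + word[-1]
        out.append(word)
    return " ".join(out)

print(scramble("please explain how tokenization works in language models"))
```

Every scrambled word keeps the same letters, yet the subword tokens it maps to differ completely from the original, which is what makes the model's robustness here surprising.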