Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation.

Published in: CoRR (2023)

Keyphrases