Defending LLMs against Jailbreaking Attacks via Backtranslation.

Yihan Wang, Zhouxing Shi, Andrew Bai, Cho-Jui Hsieh
Published in: CoRR (2024)
Keyphrases