Sign in

A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily.

Peng DingJun KuangDan MaXuezhi CaoYunsen XianJiajun ChenShujian Huang
Published in: CoRR (2023)
Keyphrases