WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models.

Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri
Published in: CoRR (2024)