Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge.

Published in: CoRR (2024)

Keyphrases