Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization.

Published in: CoRR (2024)

Keyphrases