Towards Comprehensive and Efficient Post Safety Alignment of Large Language Models via Safety Patching.

Published in: CoRR (2024)

Keyphrases