DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion.

Yu Li, Zhihua Wei, Han Jiang, Chuanyang Gong
Published in: CoRR (2024)
Keyphrases