On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization.

Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily J. Getzen, Cong Fang, Qi Long, Weijie J. Su
Published in: CoRR (2024)
Keyphrases