RRHF: Rank Responses to Align Language Models with Human Feedback without tears.

Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang, Fei Huang
Published in: CoRR (2023)