Prototypical Reward Network for Data-Efficient RLHF.

Jinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu
Published in: CoRR (2024)