Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble.

Shun Zhang, Zhenfang Chen, Sunli Chen, Yikang Shen, Zhiqing Sun, Chuang Gan
Published in: CoRR (2024)