A Minimaximalist Approach to Reinforcement Learning from Human Feedback.

Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal
Published in: CoRR (2024)
Keyphrases