Robust Preference Optimization through Reward Model Distillation.

Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Peter Shaw, Jonathan Berant
Published in: CoRR (2024)
Keyphrases