Robust Preference Optimization through Reward Model Distillation.
Adam Fisch
Jacob Eisenstein
Vicky Zayats
Alekh Agarwal
Ahmad Beirami
Chirag Nagpal
Peter Shaw
Jonathan Berant
Published in:
CoRR (2024)
Keyphrases
probabilistic model
theoretical analysis
computational model
optimization model
mathematical model
probability distribution
response surface
neural network
graphical representation
formal model
theoretical framework
cost function
multi objective
similarity measure
high level
image segmentation
social networks