Reward Model Ensembles Help Mitigate Overoptimization.
Thomas Coste
Usman Anwar
Robert Kirk
David Krueger
Published in:
ICLR (2024)
Keyphrases
formal model
statistical model
decision trees
high level
management system
theoretical analysis
computational model
probabilistic model
mathematical model
search engine
video sequences
prior knowledge
parameter estimation
experimental data
prediction model
learning models