Reward Model Ensembles Help Mitigate Overoptimization.

Thomas Coste, Usman Anwar, Robert Kirk, David Krueger

Published in: CoRR (2023)

Keyphrases

probabilistic model
experimental data
neural network
machine learning
multi agent
cost function
formal model
generative model
computational model
learning models
classification models
neural network model
conceptual model
hierarchical structure
mathematical model
theoretical analysis
management system
high level
case study