Scaling Laws for Reward Model Overoptimization.
Leo Gao, John Schulman, Jacob Hilton
Published in: ICML (2023)
Keyphrases
- probabilistic model
- computational model
- similarity measure
- formal model
- objective function
- simulation model
- management system
- mathematical model
- EM algorithm
- network model
- object model
- neural network model
- statistical model
- parameter estimation
- real time
- input data
- probability distribution
- multiscale
- case study
- clustering algorithm