Confronting Reward Model Overoptimization with Constrained RLHF.
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen McAleer
Published in: CoRR (2023)
Keyphrases
- computational model
- theoretical framework
- high level
- simulation model
- experimental data
- mathematical model
- parameter estimation
- theoretical analysis
- conceptual model
- image segmentation
- case study
- parameter values
- object model
- formal model
- hierarchical structure
- neural network
- network model
- statistical model
- input data
- probabilistic model
- dynamic programming
- prior knowledge
- information systems
- genetic algorithm
- data mining