Dataset Reset Policy Optimization for RLHF.
Jonathan D. Chang
Wenhao Zhan
Owen Oertell
Kianté Brantley
Dipendra Misra
Jason D. Lee
Wen Sun
Published in: CoRR (2024)
Keyphrases
optimization process
optimization algorithm
optimization problems
discrete optimization
global optimization
real life
constrained optimization
database
feature set
optimal policy
optimization method
combinatorial optimization
convex optimization
training dataset
optimization strategies