Group Robust Preference Optimization in Reward-free RLHF.
Shyam Sundhar Ramesh
Yifan Hu
Iason Chaimalas
Viraj Mehta
Pier Giuseppe Sessa
Haitham Bou-Ammar
Ilija Bogunovic
Published in:
CoRR (2024)
Keyphrases
optimization process
optimization algorithm
robust optimization
reinforcement learning
video sequences
computationally efficient
user preferences
global optimization
discrete optimization
simultaneous optimization
least squares
personality traits
norm minimization