Group Robust Preference Optimization in Reward-free RLHF.

Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou-Ammar, Ilija Bogunovic
Published in: CoRR (2024)