Improving Generalization of Alignment with Human Preferences through Group Invariant Learning.
Rui Zheng
Wei Shen
Yuan Hua
Wenbin Lai
Shihan Dou
Yuhao Zhou
Zhiheng Xi
Xiao Wang
Haoran Huang
Tao Gui
Qi Zhang
Xuanjing Huang
Published in: ICLR (2024)
Keyphrases
learning community
cooperative learning
decision making
learning problems
reinforcement learning
learning process
language acquisition