Everyone Deserves A Reward: Learning Customized Human Preferences.
Pengyu Cheng
Jiawen Xie
Ke Bai
Yong Dai
Nan Du
Published in: CoRR (2023)
Keyphrases
reinforcement learning
learning process
learning algorithm
learning systems
learning scheme
prior knowledge
language acquisition
knowledge acquisition
learning problems
human experts
data sets
recommender systems
supervised learning
online learning
human subjects
human learning