Sequential Preference Ranking for Efficient Reinforcement Learning from Human Feedback.
Minyoung Hwang
Gunmin Lee
Hogun Kee
Chan Woo Kim
Kyungjae Lee
Songhwai Oh
Published in:
NeurIPS (2023)
Keyphrases
reinforcement learning
web search
machine learning
human experts
ranking functions
neural network
human behavior
multi-attribute
human users
multi-criteria
preference learning