Sequential Preference Ranking for Efficient Reinforcement Learning from Human Feedback.
Minyoung Hwang
Gunmin Lee
Hogun Kee
Chan Woo Kim
Kyungjae Lee
Songhwai Oh
Published in: NeurIPS (2023)
Keyphrases
reinforcement learning
web search
machine learning
human experts
ranking functions
neural network
human behavior
multi-attribute
human users
multi-criteria
preference learning