A Minimaximalist Approach to Reinforcement Learning from Human Feedback.
Gokul Swamy
Christoph Dann
Rahul Kidambi
Zhiwei Steven Wu
Alekh Agarwal
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
function approximation
real time
human subjects
human interaction
human operators
artificial intelligence
learning process
human users
machine learning
information retrieval
creative problem solving