A Minimaximalist Approach to Reinforcement Learning from Human Feedback.

Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal

Published in: CoRR (2024)

Keyphrases

reinforcement learning
function approximation
real time
human subjects
human interaction
human operators
artificial intelligence
learning process
human users
machine learning
information retrieval
creative problem solving