Deep Reinforcement Learning from Human Preferences.
Paul F. Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
Published in:
NIPS (2017)
Keyphrases
reinforcement learning
learning algorithm
decision making
multi agent
model free
data sets
human computer interaction
Markov decision processes
human interaction
human operators
deep learning
artificial intelligence
optimal policy
computational models
function approximation
human behavior