Reward learning from human preferences and demonstrations in Atari.
Borja Ibarz
Jan Leike
Tobias Pohlen
Geoffrey Irving
Shane Legg
Dario Amodei
Published in:
CoRR (2018)
Keyphrases
reinforcement learning
learning process
learning algorithm
learning tasks
supervised learning
active learning
function approximation
partially observable