Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds.
Zhiyu Lin
Brent Harrison
Aaron Keech
Mark O. Riedl
Published in:
CoRR (2017)
Keyphrases
computational model
reinforcement learning
high level
mathematical model
Markov decision process
machine learning
decision making
probabilistic model
probability distribution
optimal policy
experimental data
formal model
model-free