Reinforcement Learning from Imperfect Demonstrations.

Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell

Published in: ICLR (Workshop) (2018)

Keyphrases

reinforcement learning
function approximation
reinforcement learning algorithms
learning algorithm
state space
direct policy search
temporal difference
optimal policy
Markov decision processes
multi-agent
multi-agent reinforcement learning
model-free
reinforcement learning methods
artificial intelligence
machine learning
relational reinforcement learning
autonomous learning
stochastic approximation
function approximators
partially observable
dynamic programming
decision trees
learning classifier systems
optimal control
supervised learning