BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning.

Xinyue Chen, Zijian Zhou, Zheng Wang, Che Wang, Yanqiu Wu, Qing Deng, Keith W. Ross

Published in: CoRR (2019)

Keyphrases

imitation learning
reinforcement learning
action selection
action space
function approximation
reinforcement learning methods
learning algorithm
mirror neurons
reinforcement learning algorithms
multi-agent
temporal difference
state space
control problems
optimal policy
optimal control
Markov decision process
multi-modal
machine learning
learning problems
humanoid robot
maximum margin
Markov decision processes