Efficient Exploration with Self-Imitation Learning via Trajectory-Conditioned Policy.
Yijie Guo
Jongwook Choi
Marcin Moczulski
Samy Bengio
Mohammad Norouzi
Honglak Lee
Published in:
CoRR (2019)
Keyphrases
imitation learning
reinforcement learning
robotic systems
humanoid robot
optimal policy
maximum margin
support vector
action selection
real time
multi modal
machine learning
feature selection
markov random field
action space
reinforcement learning methods