Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling.
Yuping Luo
Huazhe Xu
Tengyu Ma
Published in:
CoRR (2019)
Keyphrases
learning algorithm
learning process
prior knowledge
online learning
unsupervised learning
learning systems
reinforcement learning
multi agent
monte carlo
positive and negative
incremental learning