Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling.

Yuping Luo, Huazhe Xu, Tengyu Ma

Published in: ICLR (2020)

Keyphrases

learning process
online learning
reinforcement learning
machine learning
knowledge base
sample size
learning tasks
learning problems
inductive inference