Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction.
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
Published in:
CoRR (2019)
Keyphrases
error reduction
reinforcement learning
classification error
state space
multi agent
semi supervised
iterative learning
learning algorithm
feature selection
classification accuracy
significant improvement
information extraction
optimal policy
class imbalance
model selection
pairwise
reward function
machine learning