Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis.
Shaocong Ma
Yi Zhou
Shaofeng Zou
Published in:
NeurIPS (2020)
Keyphrases
convergence analysis
learning algorithm
learning problems
support vector
search algorithm
dynamic programming
upper bound
optimization algorithm
learning tasks
global convergence