Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis.

Shaocong Ma, Yi Zhou, Shaofeng Zou

Published in: CoRR (2020)

Keyphrases

convergence analysis
learning algorithm
cost function
learning tasks
reinforcement learning
state space
supervised learning
learning problems