Gradient Temporal Difference with Momentum: Stability and Convergence.

Rohan Deb, Shalabh Bhatnagar

Published in: CoRR (2021)

Keyphrases

temporal difference
step size
convergence rate
reinforcement learning
function approximation
td learning
learning rate
evaluation function
gradient method
monte carlo
model free
temporal difference learning
reinforcement learning algorithms
convergence speed
action selection
policy gradient
policy evaluation
temporal difference methods
supervised learning
function approximators
state space
policy iteration
actor critic
decision trees
least squares
learning tasks
linear combination