Gradient Temporal Difference with Momentum: Stability and Convergence.
Rohan Deb, Shalabh Bhatnagar
Published in: CoRR (2021)
Keyphrases
- temporal difference
- step size
- convergence rate
- reinforcement learning
- function approximation
- TD learning
- learning rate
- evaluation function
- gradient method
- Monte Carlo
- model free
- temporal difference learning
- reinforcement learning algorithms
- convergence speed
- action selection
- policy gradient
- policy evaluation
- temporal difference methods
- supervised learning
- function approximators
- state space
- policy iteration
- actor critic
- decision trees
- least squares
- learning tasks
- linear combination