On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence.
Nathaniel Korda, Prashanth L. A.
Published in: CoRR (2014)
Keyphrases
- function approximation
- temporal difference
- temporal difference learning
- td learning
- reinforcement learning
- reinforcement learning algorithms
- learning tasks
- model free
- temporal difference learning algorithms
- radial basis function
- td methods
- function approximators
- temporal difference methods
- policy evaluation
- convergence rate
- reinforcement learning methods
- convergence speed
- policy gradient
- least squares