Adaptive Step-Size for Online Temporal Difference Learning.
William Dabney, Andrew G. Barto
Published in: AAAI (2012)
Keyphrases
- step size
- temporal difference learning
- temporal difference
- variable step size
- convergence rate
- cost function
- convergence speed
- fixed point
- evaluation function
- function approximation
- game playing
- reinforcement learning
- reinforcement learning algorithms
- Monte Carlo
- Markov decision process
- policy iteration
- blind source separation
- wavelet coefficients