TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent.
Alexandra Kearney, Vivek Veeriah, Jaden B. Travnik, Richard S. Sutton, Patrick M. Pilarski
Published in: CoRR (2018)
Keyphrases
- step size
- temporal difference
- TD learning
- convergence rate
- convergence speed
- cost function
- evolutionary programming
- reinforcement learning algorithms
- policy evaluation
- Monte Carlo
- temporal difference methods
- policy iteration
- gradient method
- function approximators
- decision trees
- wavelet coefficients
- particle swarm optimization
- computational complexity
- image segmentation
- machine learning