Temporal Difference Updating without a Learning Rate.
Marcus Hutter, Shane Legg
Published in: NIPS (2007)
Keyphrases
- learning rate
- temporal difference
- step size
- convergence rate
- TD learning
- reinforcement learning
- evaluation function
- learning algorithm
- function approximation
- convergence speed
- Monte Carlo
- reinforcement learning algorithms
- rapid convergence
- model free
- policy iteration
- policy evaluation
- action selection
- supervised learning
- function approximators
- machine learning
- wavelet coefficients
- sufficient conditions
- Markov decision process
- genetic algorithm