Temporal Difference Updating without a Learning Rate
Marcus Hutter, Shane Legg
Published in: CoRR (2008)
Keyphrases
- learning rate
- temporal difference
- convergence rate
- step size
- reinforcement learning
- td learning
- function approximation
- learning algorithm
- evaluation function
- monte carlo
- convergence speed
- model free
- action selection
- policy evaluation
- reinforcement learning algorithms
- supervised learning
- weight vector
- rapid convergence
- policy iteration
- machine learning
- function approximators
- delta bar delta
- active learning
- data mining