Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability.
Sergey Samsonov, Daniil Tiapkin, Alexey Naumov, Eric Moulines
Published in: COLT (2024)
Keyphrases
- temporal difference
- reinforcement learning
- learning algorithm
- exponential stability
- reinforcement learning algorithms
- function approximation
- evaluation function
- sufficient conditions
- td learning
- model free
- supervised learning
- policy evaluation
- action selection
- step size
- monte carlo
- hopfield neural network
- neural network
- machine learning algorithms
- learning scheme
- active learning
- cellular neural networks
- function approximators
- evolutionary algorithm