Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning.
R. Srikant, Lei Ying
Published in: CoRR (2019)
Keyphrases
- error bounds
- stochastic approximation
- TD learning
- temporal difference
- Monte Carlo
- policy iteration
- theoretical analysis
- temporal difference learning
- evaluation function
- worst case
- function approximation
- policy evaluation
- model free
- reinforcement learning
- neural network
- reinforcement learning algorithms
- linear combination
- function approximators
- theoretical guarantees
- sufficient conditions
- training data
- learning algorithm