On the Rate of Convergence and Error Bounds for LSTD(\(\lambda\)).
Manel Tagorti, Bruno Scherrer
Published in: ICML (2015)
Keyphrases
- error bounds
- reinforcement learning
- temporal difference
- least squares
- theoretical analysis
- policy evaluation
- temporal difference learning
- policy iteration
- function approximation
- TD learning
- worst case
- evaluation function
- model free
- Monte Carlo
- fixed point
- convergence rate
- neural network
- Markov decision processes
- machine learning
- convergence speed
- variance reduction
- multi agent
- control problems
- function approximators
- learning algorithm