Rate of Convergence and Error Bounds for LSTD(λ).
Manel Tagorti, Bruno Scherrer
Published in: CoRR (2014)
Keyphrases
- error bounds
- reinforcement learning
- temporal difference
- least squares
- theoretical analysis
- temporal difference learning
- policy evaluation
- policy iteration
- td learning
- function approximation
- worst case
- convergence rate
- evaluation function
- fixed point
- markov decision processes
- finite state
- model free
- monte carlo
- reinforcement learning algorithms
- markov chain
- state space
- multi agent
- optimal policy