Rate of Convergence and Error Bounds for LSTD(λ).

Manel Tagorti, Bruno Scherrer

Published in: CoRR (2014)

Keyphrases

error bounds
reinforcement learning
temporal difference
least squares
theoretical analysis
temporal difference learning
policy evaluation
policy iteration
TD learning
function approximation
worst case
convergence rate
evaluation function
fixed point
Markov decision processes
finite state
model free
Monte Carlo
reinforcement learning algorithms
Markov chain
state space
multi agent
optimal policy