Incremental Truncated LSTD.

Clement Gehring, Martha White

Published in: CoRR (2015)

Keyphrases

reinforcement learning
temporal difference
least squares
policy evaluation
policy iteration
temporal difference learning
function approximation
TD learning
model free
Markov decision processes
step size
reinforcement learning algorithms
evaluation function
Monte Carlo
linear approximation
control problems
fixed point
multi step
state space
Markov decision process
supervised learning
cooperative