On the Asymptotic Behaviour of a Constant Stepsize Temporal-Difference Learning Algorithm.

Vladislav Tadic

Published in: EuroCOLT (1999)

Keyphrases

temporal difference
step size
learning algorithm
reinforcement learning algorithms
reinforcement learning
TD learning
policy evaluation
function approximation
supervised learning
evaluation function
convergence rate
cost function
model free
temporal difference learning
Monte Carlo
convergence speed
machine learning
machine learning algorithms
training data
action selection
active learning
policy iteration
state space
function approximators
learning process
Markov decision processes