TD_gamma: Re-evaluating Complex Backups in Temporal Difference Learning.

George Dimitri Konidaris, Scott Niekum, Philip S. Thomas

Published in: NIPS (2011)

Keyphrases

temporal difference learning
function approximation
fixed point
temporal difference
evaluation function
reinforcement learning
game playing
approximate value iteration
reinforcement learning algorithms
Markov decision process
linear programming
sufficient conditions
machine learning algorithms
learning tasks
policy iteration