On Generalized Bellman Equations and Temporal-Difference Learning.

Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton

Published in: CoRR (2017)

Keyphrases

temporal difference learning
function approximation
fixed point
evaluation function
game playing
reinforcement learning
temporal difference
approximate value iteration
Markov decision process
reinforcement learning algorithms
policy iteration
neural network
cost function
least squares
sufficient conditions
model-free