On Generalized Bellman Equations and Temporal-Difference Learning

Huizhen Yu, Ashique Rupam Mahmood, Richard S. Sutton

Published in: J. Mach. Learn. Res. (2018)

Keyphrases

temporal difference learning
fixed point
function approximation
reinforcement learning
game playing
evaluation function
approximate value iteration
temporal difference
Markov decision process
reinforcement learning algorithms
Monte Carlo
function approximators
Markov decision processes
Markov random field
state space
policy iteration
neural network