Emphatic Temporal-Difference Learning.

Ashique Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton

Published in: CoRR (2015)

Keyphrases

temporal difference learning
fixed point
function approximation
reinforcement learning
approximate value iteration
evaluation function
game playing
temporal difference
reinforcement learning algorithms
monte carlo
model selection
policy iteration
dynamical systems
markov decision process
function approximators
least squares
pairwise