Why did TD-Gammon Work?

Jordan B. Pollack, Alan D. Blair

Published in: NIPS (1996)

Keyphrases

temporal difference
TD learning
learning algorithm
reinforcement learning
reinforcement learning algorithms
evaluation function
temporal difference learning
eligibility traces
function approximation
computer vision
support vector
state space
sufficient conditions
Monte Carlo
action selection