TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search

Jonathan Baxter, Andrew Tridgell, Lex Weaver

Published in: CoRR (1999)

Keyphrases

temporal difference learning
game tree search
game playing
evaluation function
fixed point
game tree
temporal difference
monte carlo
function approximation
alpha beta
imperfect information
video games
game play
reinforcement learning
learning outcomes
np hard
evolutionary algorithm