TD methods applied to mixture of experts for learning 9×9 Go evaluation function.

Raonak Zaman, Donald C. Wunsch

Published in: IJCNN (1999)

Keyphrases

evaluation function
td learning
temporal difference
reinforcement learning
temporal difference learning
temporal difference methods
learning process
learning algorithm
td methods
supervised learning
learning tasks
function approximation
state action
connectionist networks
active learning
least squares