An improved approach to reinforcement learning in Computer Go.
Michael Dann, Fabio Zambetta, John Thangarajah
Published in: CIG (2015)
Keyphrases
- reinforcement learning
- temporal difference learning
- function approximation
- monte carlo
- temporal difference
- evaluation function
- monte carlo tree search
- game tree search
- reinforcement learning algorithms
- fixed point
- state space
- game playing
- markov decision process
- stochastic approximation
- multi agent
- transfer learning
- uct algorithm
- robotic control
- data mining
- model free
- markov decision processes
- optimal policy
- dynamic programming
- learning algorithm
- learning problems
- evolutionary algorithm
- reinforcement learning methods
- search space
- policy gradient
- machine learning
- data sets