Multi-Stage Temporal Difference Learning for 2048.
I-Chen Wu, Kun-Hao Yeh, Chao-Chin Liang, Chia-Chuan Chang, Han Chiang
Published in: TAAI (2014)
Keyphrases
- multistage
- temporal difference learning
- function approximation
- fixed point
- reinforcement learning
- evaluation function
- game playing
- dynamic programming
- approximate value iteration
- lot sizing
- single stage
- temporal difference
- reinforcement learning algorithms
- Markov decision process
- optimal policy
- policy iteration
- Markov decision processes
- dynamical systems
- attitudes toward
- Monte Carlo
- sufficient conditions
- data mining