Simulation-Based Algorithms for Markov Decision Processes: Monte Carlo Tree Search from AlphaGo to AlphaZero.
Michael C. Fu
Published in: Asia Pac. J. Oper. Res. (2019)
Keyphrases
- Markov decision processes
- policy iteration
- factored MDPs
- Monte Carlo tree search
- state space
- reachability analysis
- orders of magnitude
- reinforcement learning
- finite state
- infinite horizon
- learning algorithm
- transition matrices
- optimal policy
- dynamic programming
- temporal difference learning
- reinforcement learning methods
- computational complexity
- multi-agent systems
- stochastic shortest path