Exploring the first-move balance point of Go-Moku based on reinforcement learning and Monte Carlo tree search.
Pengsen Liu, Jizhe Zhou, Jiancheng Lv
Published in: Knowl. Based Syst. (2023)
Keyphrases
- monte carlo tree search
- reinforcement learning
- monte carlo
- temporal difference
- bayesian reinforcement learning
- reinforcement learning methods
- temporal difference learning
- tree search algorithm
- evaluation function
- function approximation
- reinforcement learning algorithms
- state space
- markov chain
- action selection
- monte carlo search
- learning algorithm
- markov decision processes
- alpha beta search
- fixed point
- lower bound