Bandit-Based Policy Optimization for Monte Carlo Tree Search in RTS Games.
Zuozhi Yang
Santiago Ontañón
Published in:
AIIDE Workshops (2020)
Keyphrases
monte carlo tree search
monte carlo
monte carlo search
bayesian reinforcement learning
tree search algorithm
evaluation function
optimization problems
markov chain
game tree
optimal policy
temporal difference
temporal difference learning
supervised learning
infinite horizon