Bandits all the way down: UCB1 as a simulation policy in Monte Carlo Tree Search.
Edward Jack Powley
Daniel Whitehouse
Peter I. Cowling
Published in: CIG (2013)
Keyphrases
Monte Carlo Tree Search
Monte Carlo
Bayesian reinforcement learning
tree search algorithm
stochastic systems
optimal policy
multi-armed bandit
Monte Carlo search
evaluation function
mathematical model
dynamic programming
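For context on the multi-armed bandit keyphrase: the UCB1 rule named in the paper's title selects the arm maximizing empirical mean reward plus an exploration bonus. The sketch below is a minimal, generic UCB1 arm-selection routine, not the authors' implementation; the function name and argument layout are illustrative assumptions.

```python
import math

def ucb1_select(counts, rewards, c=math.sqrt(2)):
    """Return the index of the arm maximizing UCB1.

    counts[i]  -- number of times arm i has been played
    rewards[i] -- total reward accumulated from arm i
    c          -- exploration constant (sqrt(2) is the classic choice)
    """
    total = sum(counts)
    best, best_score = 0, float("-inf")
    for i, (n, r) in enumerate(zip(counts, rewards)):
        if n == 0:
            return i  # play every arm once before applying the formula
        # mean reward plus exploration bonus
        score = r / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

In MCTS, this rule is normally used in the selection phase of the tree; the paper's contribution, per its title, is applying it as the simulation (rollout) policy as well.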