Adaptive playouts for online learning of policies during Monte Carlo Tree Search.
Tobias Graf
Marco Platzner
Published in:
Theor. Comput. Sci. (2016)
Keyphrases
online learning
Monte Carlo tree search
Bayesian reinforcement learning
Monte Carlo
tree search algorithm
evaluation function
optimal policy
alpha beta search
training data
reinforcement learning
dynamic environments
neural network
machine learning
Monte Carlo search