Safer Deep RL with Shallow MCTS: A Case Study in Pommerman.

Bilal Kartal, Pablo Hernandez-Leal, Chao Gao, Matthew E. Taylor

Published in: CoRR (2019)

Keyphrases

reinforcement learning
Monte Carlo tree search
natural language processing
case study
test bed
deep learning
question answering
multi-agent
hand-crafted
neural network
Markov decision processes
wall street journal
optimal policy
complex domains
temporal difference
belief nets
optimal control
Monte Carlo
information extraction
dynamic programming
search space
search algorithm
Bayesian networks
knowledge base
learning algorithm