Safer Deep RL with Shallow MCTS: A Case Study in Pommerman.
Bilal Kartal, Pablo Hernandez-Leal, Chao Gao, Matthew E. Taylor
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- monte carlo tree search
- natural language processing
- case study
- test bed
- deep learning
- question answering
- multi agent
- hand crafted
- neural network
- markov decision processes
- wall street journal
- optimal policy
- complex domains
- temporal difference
- belief nets
- optimal control
- monte carlo
- information extraction
- dynamic programming
- search space
- search algorithm
- bayesian networks
- knowledge base
- learning algorithm