Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts.
Gilwoo Lee, Brian Hou, Sanjiban Choudhury, Siddhartha S. Srinivasa
Published in: CoRR (2020)
Keyphrases
- bayesian reinforcement learning
- optimal policy
- monte carlo tree search
- reinforcement learning
- markov decision processes
- decision problems
- long run
- partially observable markov decision processes
- dynamic programming
- state space
- infinite horizon
- average reward
- markov decision process
- monte carlo
- evaluation function
- constraint propagation
- markov decision problems
- machine learning