Best-Response Bayesian Reinforcement Learning with Bayes-adaptive POMDPs for Centaurs.
Mustafa Mert Çelikok, Frans A. Oliehoek, Samuel Kaski
Published in: CoRR (2022)
Keyphrases
- Bayesian reinforcement learning
- partially observable Markov decision processes
- optimal policy
- reinforcement learning
- decision trees
- support vector
- dynamic programming
- Monte Carlo tree search
- state space
- finite state
- decision problems
- machine learning
- search algorithm
- Markov chain
- reinforcement learning algorithms