BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs.
Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- Markov decision processes
- continuous state
- adaptive control
- state space
- decision trees
- support vector
- partially observable Markov decision processes
- actor critic
- learning algorithm
- policy gradient
- learning capabilities
- belief state
- model free
- finite state
- support vector machine
- probability distribution
- multi agent