Online Planning in POMDPs with Self-Improving Simulators.
Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek
Published in: CoRR (2022)
Keyphrases
- partially observable Markov decision processes
- stochastic domains
- planning problems
- reinforcement learning
- partially observable
- belief state
- partial observability
- belief space
- planning under uncertainty
- real time
- dynamical systems
- heuristic search
- decision support
- online learning
- action selection
- AI planning
- plan execution
- state space
- multi-agent
- travel planning
- point-based value iteration