Bounded Policy Synthesis for POMDPs with Safe-Reachability Objectives.
Yue Wang, Swarat Chaudhuri, Lydia E. Kavraki
Published in: CoRR (2018)
Keyphrases
- partially observable markov decision processes
- optimal policy
- state space
- policy search
- reinforcement learning
- partially observable
- policy gradient
- markov decision problems
- markov decision processes
- asymptotically optimal
- finite state
- point based value iteration
- policy iteration algorithm
- continuous state
- infinite horizon
- belief state
- expected reward
- continuous state spaces
- finite horizon
- policy gradient methods
- markov decision process
- dynamic programming
- policy iteration
- dynamical systems
- partially observable markov decision process
- decision problems
- decision processes
- partially observable environments
- multiple objectives
- machine learning
- multi agent
- texture synthesis
- action space
- single agent
- program synthesis
- transitive closure
- dec pomdps
- markov chain
- control policies
- state dependent