Myopic Policy Bounds for Information Acquisition POMDPs.
Mikko Lauri, Nikolay Atanasov, George J. Pappas, Risto Ritala
Published in: CoRR (2016)
Keyphrases
- information acquisition
- partially observable Markov decision processes
- infinite horizon
- optimal policy
- policy search
- partially observable
- policy gradient
- reinforcement learning
- dynamic programming
- upper bound
- decision making
- Markov decision processes
- lower bound
- selective perception
- point-based value iteration
- finite state
- asymptotically optimal
- belief state
- Markov decision problems
- state space
- decision problems
- Markov decision process
- finite horizon
- expected reward
- policy gradient methods
- partially observable Markov decision process
- multi-agent
- dynamical systems
- worst case
- belief space
- continuous state spaces
- policy iteration
- upper and lower bounds
- lower and upper bounds
- control policies
- variance reduction
- optimal control
- planning problems