Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces.
Zachary N. Sunberg, Mykel J. Kochenderfer
Published in: ICAPS (2018)
Keyphrases
- online algorithms
- state action
- action space
- reinforcement learning
- continuous state
- belief state
- Markov decision processes
- average reward
- online learning
- learning algorithm
- state space
- Markov decision process
- partially observable Markov decision processes
- evaluation function
- worst case
- lower bound
- optimal policy
- partially observable
- asymptotically optimal
- stochastic games
- real valued
- average case
- reinforcement learning algorithms
- function approximation
- reward function
- learning process
- long run
- action selection
- finite state
- belief revision
- Markov chain
- supervised learning
- principal components
- search space
- sufficient conditions
- upper bound