Dual Formulations for Optimizing Dec-POMDP Controllers.
Akshat Kumar, Hala Mostafa, Shlomo Zilberstein
Published in: ICAPS (2016)
Keyphrases
- reinforcement learning
- hidden state
- control system
- state space
- finite state
- dynamical systems
- partially observable
- partially observable Markov decision processes
- belief state
- partially observable Markov decision process
- belief space
- control law
- optimal policy
- optimization methods
- function approximation
- machine learning
- Markov decision process
- partial observability
- decision makers
- model-free reinforcement learning