Centralized Optimization for Dec-POMDPs Under the Expected Average Reward Criterion.
Xiaofeng Jiang, Xiaodong Wang, Hongsheng Xi, Falin Liu
Published in: IEEE Trans. Autom. Control. (2017)
Keyphrases
- average reward
- optimality criterion
- dec pomdps
- optimal policy
- infinite horizon
- markov decision processes
- long run
- partially observable markov decision processes
- dynamic programming
- stochastic games
- decision making
- machine learning
- policy iteration
- model free
- dynamic environments
- state action
- markov decision problems
- np hard