Bounded Policy Iteration for Decentralized POMDPs.
Daniel S. Bernstein, Eric A. Hansen, Shlomo Zilberstein
Published in: IJCAI (2005)
Keyphrases
- policy iteration
- markov decision processes
- dec pomdps
- infinite horizon
- reinforcement learning
- policy iteration algorithm
- markov decision problems
- optimal policy
- partially observable markov decision processes
- finite state
- dynamic programming
- average reward
- sample path
- markov decision process
- model free
- distributed constraint optimization
- partially observable
- long run
- state space
- policy evaluation
- fixed point
- optimal control
- least squares
- multi agent
- function approximation
- continuous state
- average cost
- lost sales
- temporal difference
- reinforcement learning algorithms
- reward function
- search algorithm
- bayesian networks
- finite number
- actor critic
- np hard
- dynamical systems
- machine learning