Safe Policy Synthesis in Multi-Agent POMDPs via Discrete-Time Barrier Functions.
Mohamadreza Ahmadi, Andrew Singletary, Joel W. Burdick, Aaron D. Ames
Published in: CoRR (2019)
Keyphrases
- multi agent
- partially observable markov decision processes
- finite state
- reinforcement learning
- optimal policy
- markov chain
- single agent
- markov decision processes
- policy search
- partially observable stochastic games
- policy gradient
- decision problems
- partially observable
- state space
- markov decision problems
- cooperative
- dynamical systems
- point based value iteration
- functional programs
- dec pomdps
- multi agent systems
- belief space
- planning under uncertainty
- markov decision process
- dynamic programming
- multiagent systems
- partial observability
- continuous state
- actor critic
- partially observable markov decision process
- finite horizon
- distributed constraint optimization
- expected reward
- functional language
- asymptotically optimal
- model free reinforcement learning