Sub-optimal Policy Aided Multi-Agent Reinforcement Learning for Flocking Control.
Yunbo Qiu, Yue Jin, Jian Wang, Xudong Zhang
Published in: CoRR (2022)
Keyphrases
- optimal policy
- multi agent reinforcement learning
- reinforcement learning
- control policies
- decision problems
- Markov decision processes
- finite horizon
- state space
- dynamic programming
- infinite horizon
- state dependent
- long run
- distributed control
- multistage
- optimal control
- Markov decision process
- finite state
- multi agent
- control system
- learning agents
- control strategy
- sufficient conditions
- average reward
- linear programming
- average cost
- learning agent
- stochastic games
- reward function
- lost sales
- multi agent learning
- learning tasks