Settling the Variance of Multi-Agent Policy Gradients.
Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang
Published in: CoRR (2021)
Keyphrases
- multi agent
- cooperative
- multi agent systems
- reinforcement learning
- multiple agents
- optimal policy
- policy making
- agent oriented
- intelligent agents
- correlation coefficient
- standard deviation
- asymptotically optimal
- cooperative agents
- heterogeneous agents
- partially observable markov decision processes
- action selection
- machine learning
- markov decision process
- infinite horizon
- multiagent systems
- team formation
- oriented programming