Settling the Variance of Multi-Agent Policy Gradients.

Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

Published in: CoRR (2021)

Keyphrases

multi agent
cooperative
multi agent systems
reinforcement learning
multiple agents
optimal policy
policy making
agent oriented
intelligent agents
correlation coefficient
standard deviation
asymptotically optimal
cooperative agents
heterogeneous agents
partially observable markov decision processes
action selection
machine learning
markov decision process
infinite horizon
multiagent systems
team formation
oriented programming