Multi-agent Policy Reciprocity with Theoretical Guarantee.

Haozhi Wang, Yinchuan Li, Qing Wang, Yunfeng Shao, Jianye Hao

Published in: CoRR (2023)

Keyphrases

theoretical guarantees
multi agent
policy iteration
reinforcement learning
optimal policy
worst case
cooperative
multiple agents
Markov decision processes
multiagent systems
heterogeneous agents
single agent
machine learning
intelligent agents
multi agent systems
reward function
average cost
Markov decision process
lower bound
learning algorithm