DOP: Off-Policy Multi-Agent Decomposed Policy Gradients.
Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang
Published in: ICLR (2021)
Keyphrases
- multi agent
- cooperative
- multi agent systems
- multiagent systems
- single agent
- reinforcement learning
- agent oriented
- oriented programming
- policy making
- policy makers
- data sets
- traffic signal control
- cooperative agents
- heterogeneous agents
- gradient information
- action selection
- infinite horizon
- intelligent agents
- state dependent
- multiple agents
- coalition formation
- autonomous agents
- real time