Counterfactual Multi-Agent Policy Gradients.
Jakob N. Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson
Published in: CoRR (2017)
Keyphrases
- multi agent
- optimal policy
- cooperative
- multi agent systems
- reinforcement learning
- agent oriented
- intelligent agents
- traffic signal control
- neural network
- software agents
- supply chain
- coalition formation
- multiple agents
- action selection
- single agent
- mobile robot
- partially observable Markov decision processes
- Markov decision process
- policy makers
- learning agents
- policy making
- policy search
- oriented programming
- information technology