Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward.
Kun Shao, Yuanheng Zhu, Zhentao Tang, Dongbin Zhao
Published in: IJCNN (2020)
Keyphrases
- reinforcement learning
- cooperative multi agent
- function approximation
- state space
- eligibility traces
- learning algorithm
- model free
- reinforcement learning algorithms
- Markov decision processes
- supervised learning
- reward function
- dynamic programming
- multi agent
- neural network
- policy search
- action selection
- temporal difference
- deep learning
- Markov decision process
- artificial intelligence
- average reward
- transfer learning
- query language
- learning problems
- optimal policy
- learning process
- causal reasoning
- knowledge base
- total reward
- partially observable environments