Friend-or-Foe Deep Deterministic Policy Gradient.

Hao Jiang, Dianxi Shi, Chao Xue, Yajie Wang, Gongju Wang, Yongjun Zhang

Published in: SMC (2020)

Keyphrases

policy gradient
parametric optimization
actor-critic
reinforcement learning
gradient method
model-free reinforcement learning
function approximation
optimal control
reinforcement learning algorithms
approximation methods
machine learning
average reward
neural network
reinforcement learning methods
state space
artificial neural networks
multi-agent