Policy Gradient With Value Function Approximation For Collective Multiagent Planning.

Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau

Published in: CoRR (2018)

Keyphrases

policy gradient
multiagent planning
state action
multiagent systems
reinforcement learning
function approximation
mechanism design
gradient method
reinforcement learning algorithms
optimal control
approximation methods
temporal difference learning
state space
variance reduction
average reward
evaluation function
temporal difference
stochastic games
function approximators
model free