Policy Gradient With Value Function Approximation For Collective Multiagent Planning.
Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau
Published in: NIPS (2017)
Keyphrases
- policy gradient
- multiagent planning
- multiagent systems
- state action
- reinforcement learning
- mechanism design
- gradient method
- function approximation
- optimal control
- reinforcement learning algorithms
- variance reduction
- temporal difference
- approximation methods
- temporal difference learning
- evaluation function
- average reward
- state space
- multi agent
- convergence rate
- search algorithm