Policy Gradient With Value Function Approximation For Collective Multiagent Planning.
Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau
Published in: CoRR (2018)
Keyphrases
- policy gradient
- multiagent planning
- state action
- multiagent systems
- reinforcement learning
- function approximation
- mechanism design
- gradient method
- reinforcement learning algorithms
- optimal control
- approximation methods
- temporal difference learning
- state space
- variance reduction
- average reward
- evaluation function
- temporal difference
- stochastic games
- function approximators
- model free