Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization.
Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R. Jovanovic
Published in: CoRR (2019)
Keyphrases
- primal dual
- temporal difference learning
- multi agent
- fixed point
- reinforcement learning
- linear programming
- saddle point
- function approximation
- linear program
- convex optimization
- algorithm for linear programming
- interior point methods
- approximation algorithms
- convergence rate
- semidefinite programming
- monte carlo
- evaluation function
- temporal difference
- game playing
- reinforcement learning algorithms
- state space
- markov decision process
- policy iteration
- cost function
- special case