Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games.
Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo
Published in: COLT (2021)
Keyphrases
- infinite horizon
- multiagent reinforcement learning
- dec pomdps
- stochastic games
- markov decision processes
- finite horizon
- optimal control
- optimal policy
- long run
- multi agent
- dynamic programming
- cooperative
- cost function
- reinforcement learning algorithms
- convergence rate
- state space
- markov decision process
- multiagent systems
- lead time
- finite state
- partially observable
- average cost
- objective function
- convergence speed
- reinforcement learning
- step size
- single agent
- policy iteration
- multistage
- sufficient conditions
- graphical models