Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games.
Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo
Published in: CoRR (2021)
Keyphrases
- infinite horizon
- multiagent reinforcement learning
- dec pomdps
- stochastic games
- markov decision processes
- finite horizon
- optimal control
- optimal policy
- dynamic programming
- long run
- reinforcement learning algorithms
- state space
- cooperative
- markov decision process
- cost function
- multiagent systems
- multi agent
- lead time
- partially observable
- reinforcement learning
- average cost
- multistage
- probability distribution
- policy iteration
- convergence speed
- control problems
- machine learning
- convergence rate
- bayesian networks