Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games.
Hongyi Guo, Zuyue Fu, Zhuoran Yang, Zhaoran Wang
Published in: ICML (2021)
Keyphrases
- stochastic games
- average reward
- reinforcement learning algorithms
- multiagent reinforcement learning
- Nash equilibria
- Markov decision processes
- repeated games
- multi agent
- reinforcement learning
- Nash equilibrium
- long run
- learning automata
- optimal policy
- model free
- RL algorithms
- policy iteration
- temporal difference
- single agent
- cooperative
- learning algorithm
- infinite horizon
- imperfect information
- incomplete information
- state space
- machine learning
- learning agent
- finite state