Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation.
Qifan Zhang, Yue Guan, Panagiotis Tsiotras
Published in: CoRR (2020)
Keyphrases
- stochastic games
- Nash equilibria
- Markov decision processes
- multiagent reinforcement learning
- average reward
- incomplete information
- infinite horizon
- multi agent reinforcement learning
- optimal policy
- game theory
- Nash equilibrium
- repeated games
- reinforcement learning algorithms
- multi agent
- reinforcement learning
- learning automata
- finite horizon
- average cost
- single agent
- machine learning
- learning agent
- multistage
- dynamic environments
- special case