Learning ε-Nash equilibrium stationary policies in stochastic games with unknown independent chains using online mirror descent.
Tiancheng Qin, S. Rasoul Etesami
Published in: L4DC (2024)
Keyphrases
- stochastic games
- Nash equilibrium
- Nash equilibria
- Markov decision processes
- reinforcement learning algorithms
- game theory
- repeated games
- incomplete information
- machine learning
- multi agent
- learning algorithm
- game theoretic
- single agent
- state space
- average reward
- reinforcement learning
- linear program
- linear programming
- learning tasks
- supervised learning
- Pareto optimal
- infinite horizon
- learning process
- learning agent