Provable Policy Gradient Methods for Average-Reward Markov Potential Games.
Min Cheng, Ruida Zhou, P. R. Kumar, Chao Tian
Published in: AISTATS (2024)
Keyphrases
- average reward
- stochastic games
- policy gradient
- policy gradient methods
- Markov decision processes
- Markov chain
- long run
- actor critic
- optimal policy
- Nash equilibria
- model free
- reinforcement learning
- Markov model
- multi agent
- policy iteration
- game theory
- gradient method
- Monte Carlo
- dynamic environments
- natural actor critic