Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games.

Dingyang Chen, Qi Zhang, Thinh T. Doan

Published in: CoRR (2022)

Keyphrases

policy gradient
Markov chain
reinforcement learning
Markov model
actor critic
game theory
convergence rate
game playing