Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property.
Ioannis Anagnostides
Ioannis Panageas
Gabriele Farina
Tuomas Sandholm
Published in: AAAI (2024)
Keyphrases
policy gradient
multi player
reinforcement learning algorithms
game playing
reinforcement learning
convergence rate
multi agent
control system
upper bound
initial state