Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property.

Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm
Published in: AAAI (2024)
Keyphrases
  • policy gradient
  • multi player
  • reinforcement learning algorithms
  • game playing
  • reinforcement learning
  • convergence rate
  • multi agent
  • control system
  • upper bound
  • initial state