Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games.

Wenhao Zhan, Jason D. Lee, Zhuoran Yang
Published in: CoRR (2022)
Keyphrases
  • online learning
  • learning algorithm
  • learning process
  • reinforcement learning
  • multi agent
  • game theory
  • worst case
  • linear programming
  • multiagent reinforcement learning