Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games.
Wenhao Zhan
Jason D. Lee
Zhuoran Yang
Published in:
CoRR (2022)
Keyphrases
online learning
learning algorithm
learning process
reinforcement learning
multi-agent
game theory
worst case
linear programming
multiagent reinforcement learning