Polynomial Convergence of Bandit No-Regret Dynamics in Congestion Games.
Leello Tadesse Dadi, Ioannis Panageas, Stratis Skoulakis, Luca Viano, Volkan Cevher
Published in: CoRR (2024)
Keyphrases
- bandit problems
- congestion games
- regret bounds
- upper confidence bound
- initial conditions
- multi-armed bandit problems
- online learning
- Nash equilibria
- dynamical systems
- multi-armed bandit
- expert advice
- contextual bandit
- pure Nash equilibria
- random sampling
- convergence rate
- game theory
- reinforcement learning
- pure Nash equilibrium
- loss function
- worst case
- multi-agent