Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions.
Sébastien Bubeck
Thomas Budzinski
Mark Sellke
Published in:
COLT (2021)
Keyphrases
multi armed bandit
cooperative
regret bounds
multi armed bandits
multi player
reinforcement learning
online game
lower bound
online learning
game playing
optimal solution
multi agent systems
information sharing
linear regression
educational games