On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits.

Naumaan Nayyar, Dileep M. Kalathil, Rahul Jain

Published in: CoRR (2015)

Keyphrases

multi armed bandits
online learning
learning process
bandit problems
learning algorithm
worst case
multi armed bandit
multi agent
dynamic programming
statistically significant
situated learning
multi player