Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds.

Shinji Ito, Taira Tsuchiya, Junya Honda

Published in: CoRR (2022)

Keyphrases

multi-armed bandit
learning algorithm
regret bounds
closed form
probabilistic model
worst case
optimal solution
upper bound
prediction error