Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds

Shinji Ito, Kei Takemura

Published in: CoRR (2023)

Keyphrases

regret bounds
k-means
learning algorithm
objective function
similarity measure
computational complexity
expectation maximization
worst case
closed form
multi-armed bandit
probabilistic model
upper bound
graphical models
online learning
theoretical guarantees