ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits.

Thomas Kleine Buening, Aadirupa Saha

Published in: CoRR (2022)

Keyphrases

non-stationary
adaptive algorithms
optimal solution
learning algorithm
worst case
detection algorithm
multiscale
NP-hard
expectation maximization
segmentation algorithm
dynamic programming
upper bound
online learning
confidence bounds