Tuning Confidence Bound for Stochastic Bandits with Bandit Distance.

Xinyu Zhang, Srinjoy Das, Kenneth Kreutz-Delgado

Published in: CoRR (2021)

Keyphrases

regret bounds
online learning
lower bound
multi armed bandit
linear regression
upper bound
high confidence
confidence measure
stochastic systems
multi armed bandits
bayesian networks
distance measure
distance function
data sets
parameter tuning
minimum distance
stochastic models
bregman divergences
mahalanobis distance
information theoretic
euclidean distance
neural network