Tuning Confidence Bound for Stochastic Bandits with Bandit Distance.
Xinyu Zhang, Srinjoy Das, Kenneth Kreutz-Delgado
Published in: CoRR (2021)
Keyphrases
- regret bounds
- online learning
- lower bound
- multi armed bandit
- linear regression
- upper bound
- high confidence
- confidence measure
- stochastic systems
- multi armed bandits
- bayesian networks
- distance measure
- distance function
- data sets
- parameter tuning
- minimum distance
- stochastic models
- bregman divergences
- mahalanobis distance
- information theoretic
- euclidean distance
- neural network