Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits.
Julian Zimmert, Tor Lattimore
Published in: COLT (2022)
Keyphrases
- regret bounds
- worst case
- multi-armed bandit
- error tolerance
- upper bound
- lower bound
- probability distribution
- tight bounds
- variance reduction
- linear regression
- asymptotically optimal
- semi-infinite programming
- closed form
- average case
- expected loss
- optimal linear
- optimal cost
- multi-agent
- dynamic programming
- wide range
- Grassmann manifold
- closed form solutions
- arbitrarily close
- online learning
- optimality criterion
- large deviations
- optimal solution
- Neyman-Pearson
- upper and lower bounds