Smooth Contextual Bandits: Bridging the Parametric and Nondifferentiable Regret Regimes.
Yichun Hu, Nathan Kallus, Xiaojie Mao
Published in: Oper. Res. (2022)
Keyphrases
- regret bounds
- multi armed bandit problems
- multi armed bandit
- multi armed bandits
- contextual information
- online learning
- lower bound
- bandit problems
- loss function
- context sensitive
- stochastic systems
- expert advice
- worst case
- confidence bounds
- machine learning
- weighted majority
- parametric models
- parametric representation
- smooth surfaces
- binary classification
- data sets
- multi class
- upper bound
- support vector
- image sequences
- neural network