Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes.

Yichun Hu, Nathan Kallus, Xiaojie Mao

Published in: CoRR (2019)

Keyphrases

regret bounds
loss function
multi-armed bandit problems
multi-armed bandits
multi-armed bandit
contextual information
lower bound
bandit problems
online learning
objective function
expert advice
context-sensitive
stochastic systems
worst case
machine learning
upper bound
regret minimization
confidence bounds
digital divide
parametric models
context-aware
pairwise
image sequences