Contextual bandits with surrogate losses: Margin bounds and efficient algorithms.

Dylan J. Foster, Akshay Krishnamurthy

Published in: NeurIPS (2018)

Keyphrases

regret bounds
lower bound
online learning
contextual information
upper bound
linear regression
upper and lower bounds
support vector
multi-armed bandit
lower and upper bounds
context sensitive
worst case bounds
data sets
Bregman divergences
NP-hard
special case
active learning
training set
reinforcement learning
neural network