Contextual bandits with surrogate losses: Margin bounds and efficient algorithms.
Dylan J. Foster, Akshay Krishnamurthy
Published in: NeurIPS (2018)
Keyphrases
- regret bounds
- lower bound
- online learning
- contextual information
- upper bound
- linear regression
- upper and lower bounds
- support vector
- multi-armed bandit
- lower and upper bounds
- context sensitive
- worst case bounds
- data sets
- Bregman divergences
- NP-hard
- special case
- active learning
- training set
- reinforcement learning
- neural network