Tight Regret Bounds for Infinite-armed Linear Contextual Bandits.

Yingkai Li, Yining Wang, Xi Chen, Yuan Zhou

Published in: AISTATS (2021)

Keyphrases

regret bounds
lower bound
upper bound
online learning
linear regression
worst case
multi-armed bandit
objective function
machine learning
e-learning
similarity measure
optimal solution
information-theoretic
Bregman divergences
linear predictors
online convex optimization