Improved Regret for Zeroth-Order Stochastic Convex Bandits.

Tor Lattimore, András György

Published in: COLT (2021)

Keyphrases

Monte Carlo
regret bounds
machine learning
binary classification
multi-armed bandits