Regret Bounds and Minimax Policies under Partial Monitoring.

Jean-Yves Audibert, Sébastien Bubeck

Published in: J. Mach. Learn. Res. (2010)

Keyphrases

regret bounds
optimal policy
lower bound
worst case
multi-armed bandit
image sequences
learning algorithm
feature selection
active learning
special case