Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits.

Subhojyoti Mukherjee, Odalric-Ambrym Maillard

Published in: CoRR (2019)

Keyphrases

upper bound
lower bound
uniformly distributed
large deviations
worst case
random variables
spatial distribution
piecewise linear
power law
regret bounds
joint distribution
confidence bounds
expected loss
stochastic systems
upper and lower bounds
learning algorithm
reinforcement learning