Tight Regret Bounds for Single-pass Streaming Multi-armed Bandits.

Published in: CoRR (2023)

Keyphrases

single-pass
multi-armed bandits
multi-armed bandit
regret bounds
lower bound
stream mining
upper bound
data streams
reinforcement learning
worst case
online learning
optimal solution
NP-hard
learning algorithm
probability distribution
bandit problems