Near-optimal Per-Action Regret Bounds for Sleeping Bandits.
Quan M. Nguyen
Nishant Mehta
Published in:
AISTATS (2024)
Keyphrases
regret bounds
multi-armed bandit
lower bound
online learning
linear regression
upper bound
image sequences
reinforcement learning