Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem.
Nadav Merlis
Shie Mannor
Published in:
COLT (2019)
Keyphrases
batch size
multi armed bandit
regret bounds
batch mode
single item
batch processing
poisson process
reinforcement learning
lower bound
finite horizon
active learning
supervised learning