Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem.
Nadav Merlis
Shie Mannor
Published in:
CoRR (2019)
Keyphrases
batch size
multi armed bandit
regret bounds
single item
batch mode
poisson process
cost function
finite horizon
batch processing
reinforcement learning
lower bound
special case
lot sizing