Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits.
Sivan Sabato
Published in:
NeurIPS (2019)
Keyphrases
multi armed bandits
bandit problems
multi armed bandit problems
multi armed bandit
decision problems
long run
reinforcement learning
learning algorithm
bayesian networks
np hard
online advertising