Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit.
Djallel Bouneffouf
Srinivasan Parthasarathy
Horst Samulowitz
Martin Wistuba
Published in:
CoRR (2019)
Keyphrases
lower bound
reinforcement learning
online learning
information theoretic