Contextual Bandit Learning with Predictable Rewards
Alekh Agarwal
Miroslav Dudík
Satyen Kale
John Langford
Robert E. Schapire
Published in:
CoRR (2012)
Keyphrases
reinforcement learning
learning systems
learning algorithm
supervised learning
learning process
natural language processing
online learning
unsupervised learning
contextual bandit
upper confidence bound