Contextual Bandit Learning with Predictable Rewards.

Alekh Agarwal, Miroslav Dudík, Satyen Kale, John Langford, Robert E. Schapire

Published in: AISTATS (2012)

Keyphrases

learning systems
learning algorithm
reinforcement learning
learning process
online learning
information retrieval
feature selection
active learning
learning environment
probabilistic model
co-occurrence