Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
Published in:
CoRR (2012)
Keyphrases
contextual information
information systems
neural network
machine learning
real time
reinforcement learning
monte carlo
transfer function
sampled data
multi-armed bandit
multi-armed bandits