Regret Bounds for Thompson Sampling in Restless Bandit Problems.
Young Hun Jung
Ambuj Tewari
Published in:
CoRR (2019)
Keyphrases
multi-armed bandit
multi-armed bandits
bandit problems
regret bounds
reinforcement learning
decision problems
lower bound
online learning
Monte Carlo
machine learning
linear regression