Regret Bounds for Thompson Sampling in Restless Bandit Problems.
Young Hun Jung
Ambuj Tewari
Published in: CoRR (2019)
Keyphrases
multi armed bandit
multi armed bandits
bandit problems
regret bounds
reinforcement learning
decision problems
lower bound
online learning
monte carlo
machine learning
linear regression