Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems.
Young Hun Jung
Ambuj Tewari
Published in: NeurIPS (2019)
Keyphrases
multi armed bandit
multi armed bandits
bandit problems
regret bounds
reinforcement learning
linear regression
decision problems
decision trees
special case