The Non-Bayesian Restless Multi-Armed Bandit: A Case of Near-Logarithmic Strict Regret
Wenhan Dai
Yi Gai
Bhaskar Krishnamachari
Qing Zhao
Published in:
CoRR (2011)
Keyphrases
multi armed bandit
regret bounds
multi armed bandits
worst case
reinforcement learning
online learning
optimal control
feature selection
multi class
posterior probability
linear regression
multi armed bandit problems