The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
Wenhan Dai
Yi Gai
Bhaskar Krishnamachari
Qing Zhao
Published in:
CoRR (2010)
Keyphrases
multi-armed bandit
regret bounds
multi-armed bandits
reinforcement learning
worst case
online learning
lower bound
upper bound
maximum likelihood
maximum entropy
decentralized decision making
Bayesian networks
optimal solution