Extended UCB Policy for Multi-Armed Bandit with Light-Tailed Reward Distributions
Keqin Liu
Qing Zhao
Published in:
CoRR (2011)
Keyphrases
multi armed bandit
multi armed bandits
reinforcement learning
heavy tailed
gaussian distribution
decentralized decision making
probability distribution
learning algorithm
markov decision process
online learning
mixture model
theoretical guarantees
markov decision problems