The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond

Aurélien Garivier, Olivier Cappé

Published in: CoRR (2011)

Keyphrases

learning algorithm
multi-armed bandit
improved algorithm
times faster
high accuracy
preprocessing
dynamic programming
worst case
computational cost
significant improvement
cost function
expectation maximization
genetic algorithm
recognition algorithm
ant colony optimization
optimization algorithm
particle swarm optimization
linear programming
computational complexity
experimental evaluation
np hard
simulated annealing
neural network
detection algorithm
k means
Monte Carlo
optimal solution
similarity measure
stochastic approximation