The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Aurélien Garivier, Olivier Cappé
Published in: CoRR (2011)
Keyphrases
- learning algorithm
- multi-armed bandit
- improved algorithm
- times faster
- high accuracy
- preprocessing
- dynamic programming
- worst case
- computational cost
- significant improvement
- cost function
- expectation maximization
- genetic algorithm
- recognition algorithm
- ant colony optimization
- optimization algorithm
- particle swarm optimization
- linear programming
- computational complexity
- experimental evaluation
- NP-hard
- simulated annealing
- neural network
- detection algorithm
- k-means
- Monte Carlo
- optimal solution
- similarity measure
- stochastic approximation