An Optimal High Probability Algorithm for the Contextual Bandit Problem
Alina Beygelzimer, John Langford, Lihong Li, Lev Reyzin, Robert E. Schapire
Published in: CoRR (2010)
Keyphrases
- dynamic programming
- optimal solution
- objective function
- contextual bandit
- worst case
- preprocessing
- NP-hard
- learning algorithm
- upper confidence bound
- detection algorithm
- probabilistic model
- cost function
- computational complexity
- information retrieval
- information extraction
- semi-supervised
- simulated annealing
- segmentation algorithm
- high efficiency
- globally optimal
- optimality criterion