Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem.
Junpei Komiyama, Junya Honda, Hisashi Kashima, Hiroshi Nakagawa
Published in: COLT (2015)
Keyphrases
- lower bound
- worst case
- optimal solution
- dynamic programming
- competitive ratio
- objective function
- learning algorithm
- regret bounds
- NP-hard
- upper bound
- search space
- computational complexity
- online algorithms
- globally optimal
- preprocessing
- detection algorithm
- multi-armed bandit
- optimal cost
- polynomial approximation
- average case
- cost function
- knapsack problem
- branch and bound
- lower and upper bounds
- markov chain
- linear programming
- least squares
- randomized algorithm
- state space
- k-means
- clustering algorithm