Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem.
Junpei Komiyama, Junya Honda, Hisashi Kashima, Hiroshi Nakagawa
Published in: CoRR (2015)
Keyphrases
- worst case
- lower bound
- optimal solution
- regret bounds
- competitive ratio
- dynamic programming
- NP-hard
- computational complexity
- k-means
- upper bound
- learning algorithm
- detection algorithm
- search space
- probabilistic model
- preprocessing
- objective function
- optimal cost
- lower and upper bounds
- globally optimal
- multi-armed bandit
- genetic algorithm
- online algorithms
- random sampling
- approximation algorithms
- linear programming
- exhaustive search
- linear regression
- constant factor
- branch and bound
- Markov decision processes