An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem.

Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

Published in: CoRR (2022)

Keyphrases

asymptotically optimal
learning algorithm
real time
search space
dynamic programming
optimal solution
computational complexity
probabilistic model
simulated annealing
neural network
information systems
web services
similarity measure
worst case
response time
multi dimensional