Batched Dueling Bandits.
Arpit Agarwal
Rohan Ghuge
Viswanath Nagarajan
Published in:
CoRR (2022)
Keyphrases
stochastic systems
multi-armed bandits
cost function
multi-objective
learning algorithm
decision making
learning environment
NP-hard
online learning
sample path
multi-armed bandit