Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits.
Aadirupa Saha
Shubham Gupta
Published in:
ICML (2022)
Keyphrases
non stationary
worst case
adaptive algorithms
learning algorithm
regret bounds
computationally efficient
multi armed bandit
lower bound
online learning
expert advice
dynamic programming
loss function
stock price
confidence bounds