Bandit algorithms: Letting go of logarithmic regret for statistical robustness.
Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna P. Jagannathan
Published in: AISTATS (2021)
Keyphrases
- worst case
- regret bounds
- learning algorithm
- computationally efficient
- bandit problems
- online algorithms
- computational efficiency
- orders of magnitude
- upper confidence bound
- neural network
- statistical methods
- times faster
- optimization problems
- computational cost
- significant improvement
- computational complexity
- theoretical analysis
- machine learning algorithms
- benchmark datasets
- online learning
- least squares
- statistical approaches
- lower bound
- multi-armed bandit
- regret minimization
- support vector