Bandit algorithms: Letting go of logarithmic regret for statistical robustness.
Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna P. Jagannathan
Published in: CoRR (2020)
Keyphrases
- worst case
- regret bounds
- learning algorithm
- times faster
- information theoretic
- orders of magnitude
- upper confidence bound
- computational complexity
- online learning
- benchmark datasets
- bandit problems
- lower bound
- computationally efficient
- loss function
- data structure
- computational efficiency
- expert advice
- statistical methods
- data mining
- online convex optimization
- theoretical analysis
- statistical analysis
- computational cost
- dynamic programming
- significant improvement
- decision trees