Adversarial Bandits with Corruptions: Regret Lower Bound and No-regret Algorithm.
Lin Yang, Mohammad Hassan Hajiesmaili, Mohammad Sadegh Talebi, John C. S. Lui, Wing Shing Wong
Published in: NeurIPS (2020)
Keyphrases
- lower bound
- worst case
- regret bounds
- optimal solution
- upper bound
- NP-hard
- objective function
- dynamic programming
- detection algorithm
- online convex optimization
- confidence bounds
- k-means
- weighted majority
- computational complexity
- online learning
- multi-armed bandit
- cost function
- branch and bound
- lower and upper bounds
- online algorithms
- competitive ratio
- randomized algorithm
- search space
- upper confidence bound
- active learning