Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret.
Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai
Published in: NeurIPS (2022)
Keyphrases
- non-stationary
- regret bounds
- game theory
- autoregressive
- lower bound
- multi-armed bandits
- adaptive algorithms
- online learning
- loss function
- multi-armed bandit
- concept drift
- empirical mode decomposition
- multi-armed bandit problems
- change point detection
- white noise
- temporal evolution
- stock price
- planning problems
- multi component
- incomplete information
- gaussian mixture model
- image sequences