Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information.
Peter Auer, Yifang Chen, Pratik Gajane, Chung-Wei Lee, Haipeng Luo, Ronald Ortner, Chen-Yu Wei
Published in: COLT (2019)
Keyphrases
- non-stationary
- prior information
- regret bounds
- prior knowledge
- Bayesian inference
- worst case
- multi-armed bandit
- adaptive algorithms
- empirical mode decomposition
- autoregressive
- prior distribution
- white noise
- stock price
- linear regression
- temporal evolution
- multiresolution
- change point detection
- biomedical signals
- machine learning
- prior models
- online learning
- training data
- decision trees