Optimistic Whittle Index Policy: Online Learning for Restless Bandits.
Kai Wang, Lily Xu, Aparna Taneja, Milind Tambe
Published in: CoRR (2022)
Keyphrases
- online learning
- e learning
- semi markov
- distance learning
- online course
- higher education
- computer mediated
- blended learning
- stochastic systems
- optimal policy
- distance education
- index structure
- conservation laws
- database
- regret bounds
- online algorithms
- optimal control
- indexing method
- corporate training
- action selection
- similarity search
- dynamic pricing
- multi dimensional
- dynamic programming
- active learning
- decision trees