Optimistic Whittle Index Policy: Online Learning for Restless Bandits.

Kai Wang, Lily Xu, Aparna Taneja, Milind Tambe

Published in: AAAI (2023)

Keyphrases

online learning
semi-Markov
e-learning
distance education
regret bounds
online course
optimal control
distance learning
higher education
optimal policy
blended learning
computer-mediated
indexing method
database
active learning
B-tree
asymptotically optimal
multi-armed bandit
indexing techniques
stochastic systems
conservation laws
online learning environments
policy makers
Markov decision processes
data sets