Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation.
Guojun Xiong, Jian Li
Published in: NeurIPS (2023)
Keyphrases
- function approximation
- reinforcement learning
- neural network
- radial basis function
- function approximators
- learning tasks
- temporal difference learning
- reinforcement learning algorithms
- model free
- temporal difference
- mackey glass
- state space
- multi armed bandits
- decision making
- active learning
- probability distribution