Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation.
Guojun Xiong
Jian Li
Published in:
CoRR (2023)
Keyphrases
function approximation
reinforcement learning
neural network
learning tasks
temporal difference learning algorithms
function approximators
temporal difference learning
radial basis function
model-free
artificial neural networks
temporal difference
optimal policy
Mackey-Glass
multi-armed bandits