Towards Q-learning the Whittle Index for Restless Bandits.
Jing Fu, Yoni Nazarathy, Sarat Moka, Peter G. Taylor
Published in: ANZCC (2019)
Keyphrases
- reinforcement learning
- cooperative
- multi-agent
- optimal control
- database
- function approximation
- learning algorithm
- stochastic systems
- semi-Markov
- state space
- conservation laws
- stochastic approximation
- index structure
- multi-dimensional
- indexing techniques
- learning rate
- model-free
- indexing method
- data structure
- multi-agent reinforcement learning
- decision making
- data sets