Tabular and Deep Reinforcement Learning for Gittins Index.

Harshit Dhankar, Kshitij Mishra, Tejas Bodas

Published in: CoRR (2024)

Keyphrases

reinforcement learning
learning algorithm
state space
index structure
function approximation
real time
learning process
reinforcement learning algorithms
temporal difference
action selection
structural similarity
partially observable
Markov decision processes
transfer learning
temporal difference learning
case study
information systems
database
deep learning
policy search
indexing method
dynamic allocation
model free
inverted index
optimal control
B-tree
optimal policy
supervised learning
machine learning