Reinforcement Learning and Regret Bounds for Admission Control.

Lucas Weber, Ana Busic, Jiamin Zhu

Published in: CoRR (2024)

Keyphrases

admission control
reinforcement learning
control policy
regret bounds
multi-armed bandit
end-to-end
ATM networks
quality of service
production system
resource management
machine learning
state space
web server
online learning
lower bound
Markov decision processes
resource consumption
optimal control
long run
linear regression
optimal policy
learning algorithm
multi-class
dynamic programming
multi-agent
Bregman divergences
database systems
information retrieval