Offline Reinforcement Learning for Bandwidth Estimation in RTC Using a Fast Actor and Not-So-Furious Critic.
Ekrem Çetinkaya, Ahmet Pehlivanoglu, Ihsan U. Ayten, Basar Yumakogullari, Mehmet E. Ozgun, Yigit K. Erinc, Enes Deniz, Ali C. Begen
Published in: MMSys (2024)
Keyphrases
- reinforcement learning
- reinforcement learning algorithms
- function approximation
- temporal difference
- actor critic
- real time
- accurate estimation
- machine learning
- model free
- learning algorithm
- multi agent
- learning process
- state space
- policy gradient
- bandwidth utilization
- network bandwidth
- step size
- Markov decision processes
- dynamic programming
- neural network