LFQ: Online Learning of Per-flow Queuing Policies using Deep Reinforcement Learning.
Maximilian Bachl, Joachim Fabini, Tanja Zseby
Published in: CoRR (2020)
Keyphrases
- online learning
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- control policies
- online course
- state space
- partially observable markov decision processes
- markov decision processes
- reward function
- distance education
- computer mediated
- e learning
- fitted q iteration
- policy gradient methods
- queuing systems
- markov decision problems
- control policy
- reinforcement learning algorithms
- higher education
- scheduling algorithm
- finite state
- round robin
- temporal difference
- distance learning
- function approximation
- reinforcement learning agents
- hierarchical reinforcement learning
- decision problems
- dynamic programming
- learning algorithm
- model free
- active learning
- flow field
- blended learning
- learning process
- long run
- multi agent reinforcement learning
- multi agent
- optimal control
- online learning environments
- state abstraction
- continuous state
- scheduling policies
- policy iteration