LFQ: Online Learning of Per-flow Queuing Policies using Deep Reinforcement Learning.
Maximilian Bachl, Joachim Fabini, Tanja Zseby
Published in: LCN (2020)
Keyphrases
- online learning
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- control policies
- reward function
- function approximation
- computer mediated
- online course
- partially observable markov decision processes
- higher education
- hierarchical reinforcement learning
- e learning
- fitted q iteration
- markov decision processes
- control policy
- decision problems
- markov decision problems
- state space
- dynamic programming
- reinforcement learning agents
- blended learning
- distance education
- distance learning
- active learning
- policy gradient methods
- flow patterns
- temporal difference
- reinforcement learning algorithms
- end to end
- online learning environments
- policy iteration
- multi agent
- reinforcement learning methods
- finite state
- continuous state
- online algorithms
- average cost
- partially observable
- machine learning
- learning process
- model free
- optimal control
- classroom learning
- transfer learning
- learning environment
- learning algorithm