Parallel bootstrap-based on-policy deep reinforcement learning for continuous flow control applications.
Jonathan Viquerat, Elie Hachem
Published in: CoRR (2023)
Keyphrases
- flow control
- reinforcement learning
- optimal policy
- state dependent
- action space
- state space
- policy search
- low bandwidth
- dynamic programming
- Markov decision processes
- shared memory
- long run
- steady state
- partially observable Markov decision processes
- computer simulation
- end to end
- queueing networks
- average cost
- control system
- learning algorithm