Parallel bootstrap-based on-policy deep reinforcement learning for continuous flow control applications.

Jonathan Viquerat, Elie Hachem

Published in: CoRR (2023)

Keyphrases

flow control
reinforcement learning
optimal policy
state dependent
action space
state space
policy search
low bandwidth
dynamic programming
Markov decision processes
shared memory
long run
steady state
partially observable Markov decision processes
computer simulation
end to end
queueing networks
average cost
control system
learning algorithm