Reinforcement learning based optimal control of batch processes using Monte-Carlo deep deterministic policy gradient with phase segmentation.
Haeun Yoo, Boeun Kim, Jong Woo Kim, Jay H. Lee
Published in: Comput. Chem. Eng. (2021)
Keyphrases
- monte carlo
- optimal control
- policy gradient
- actor critic
- reinforcement learning
- variance reduction
- dynamic programming
- control problems
- control strategy
- importance sampling
- markov chain
- infinite horizon
- function approximation
- particle filter
- policy evaluation
- control law
- temporal difference
- rl algorithms
- optimal policy
- learning algorithm
- partially observable markov decision processes
- policy iteration
- confidence intervals
- temporal difference learning
- radial basis function