Mild Action Blending Policy on Deep Reinforcement Learning with Discretized Actions for Process Control.
Yoshio Tange, Satoshi Kiryu, Tetsuro Matsui
Published in: SICE (2020)
Keyphrases
- process control
- action selection
- reinforcement learning
- action space
- state action
- agent learns
- partially observable domains
- policy search
- state and action spaces
- continuous action
- control system
- state space
- partially observable
- product quality
- semiconductor manufacturing
- temporal difference
- reward function
- continuous state
- optimal policy
- intelligent control
- reinforcement learning algorithms
- function approximation
- Markov decision processes
- reward signal
- evaluation function
- function approximators
- Markov decision process
- dynamic programming
- joint action
- action models
- reinforcement learning methods
- multi agent
- perceptual aliasing
- agent receives
- manufacturing process
- transition model
- reasoning about actions
- learning agent
- learning algorithm
- inverse reinforcement learning
- multiagent reinforcement learning
- policy gradient
- state transitions
- reward shaping
- action descriptions
- partially observable Markov decision processes
- model free
- primitive actions