A Novel Entropy-Maximizing TD3-based Reinforcement Learning for Automatic PID Tuning.
Myisha A. Chowdhury, Qiugang Lu
Published in: ACC (2023)
Keyphrases
- reinforcement learning
- temporal difference
- reinforcement learning algorithms
- function approximation
- temporal difference learning
- learning algorithm
- eligibility traces
- information theory
- td learning
- state space
- control algorithm
- information theoretic
- mutual information
- control system
- multi agent
- markov decision processes
- neural network
- learning process
- semi automatic
- control method
- machine learning
- policy evaluation
- proportional integral derivative
- information entropy
- action selection
- optimal control
- evaluation function
- optimal policy
- multi objective