A Novel Entropy-Maximizing TD3-based Reinforcement Learning for Automatic PID Tuning.
Myisha A. Chowdhury, Qiugang Lu
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- temporal difference
- function approximation
- reinforcement learning algorithms
- temporal difference learning
- control system
- model free
- learning algorithm
- state space
- information theory
- machine learning
- eligibility traces
- fully automatic
- mutual information
- multi agent
- learning process
- temperature control
- dynamic programming
- control algorithm
- semi automatic
- Markov decision processes
- parameter tuning
- control method
- function approximators
- reinforcement learning methods
- policy evaluation
- policy search
- TD learning
- real time