A Novel Entropy-Maximizing TD3-based Reinforcement Learning for Automatic PID Tuning.

Myisha A. Chowdhury, Qiugang Lu

Published in: ACC (2023)

Keyphrases

reinforcement learning
temporal difference
reinforcement learning algorithms
function approximation
temporal difference learning
learning algorithm
eligibility traces
information theory
td learning
state space
control algorithm
information theoretic
mutual information
control system
multi agent
markov decision processes
neural network
learning process
semi automatic
control method
machine learning
policy evaluation
proportional integral derivative
information entropy
action selection
optimal control
evaluation function
optimal policy
multi objective