Policy Gradient Adaptive Critic Designs for Model-Free Optimal Tracking Control With Experience Replay.
Mingduo Lin, Bo Zhao, Derong Liu
Published in: IEEE Trans. Syst. Man Cybern. Syst. (2022)
Keyphrases
- model free
- reinforcement learning algorithms
- policy gradient
- average reward
- reinforcement learning
- function approximation
- optimal control
- temporal difference
- reinforcement learning methods
- dynamic programming
- policy iteration
- stochastic games
- real time
- control law
- computational intelligence
- Markov decision processes
- nonlinear systems