An efficient and robust gradient reinforcement learning: Deep comparative policy.
Jiaguo Wang, Wenheng Li, Chao Lei, Meng Yang, Yang Pei
Published in: J. Intell. Fuzzy Syst. (2024)
Keyphrases
- reinforcement learning
- optimal policy
- policy gradient
- policy search
- computationally efficient
- function approximation
- Markov decision process
- highly efficient
- policy iteration
- approximate dynamic programming
- dynamic programming
- action selection
- transition model
- reinforcement learning algorithms
- neural network
- actor critic
- control problems
- control policy
- state dependent
- action space
- temporal difference
- learning algorithm