Actor-Critic Learning Control With Regularization and Feature Selection in Policy Gradient Estimation.
Luntong Li, Dazi Li, Tianheng Song, Xin Xu
Published in: IEEE Trans. Neural Networks Learn. Syst. (2021)
Keyphrases
- actor critic
- feature selection
- policy gradient
- gradient estimation
- optimal control
- reinforcement learning
- policy gradient methods
- learning tasks
- temporal difference
- gradient method
- approximate dynamic programming
- monte carlo
- function approximation
- text categorization
- partially observable
- supervised learning
- support vector machine
- state space
- machine learning