Training a robust reinforcement learning controller for the uncertain system based on policy gradient method.
Zhan Li, Shengri Xue, Weiyang Lin, Mingsi Tong
Published in: Neurocomputing (2018)
Keyphrases
- actor critic
- gradient method
- policy gradient
- reinforcement learning
- robust stability
- convergence rate
- step size
- optimal control
- optimal policy
- approximate dynamic programming
- optimization methods
- negative matrix factorization
- control policy
- action selection
- average reward
- supervised learning
- machine learning
- information extraction
- state space
- training set
- objective function