Exploring Policy Diversity in Parallel Actor-Critic Learning.
Yanqiang Zhang, Yuanzhao Zhai, Gongqian Zhou, Bo Ding, Dawei Feng, Songwang Liu
Published in: ICTAI (2022)
Keyphrases
- actor critic
- reinforcement learning
- policy gradient
- learning algorithm
- policy gradient methods
- optimal control
- gradient method
- learning tasks
- action selection
- natural actor critic
- sufficient conditions
- mathematical model
- neuro fuzzy
- temporal difference
- supervised learning
- approximate dynamic programming
- active learning
- machine learning