Extensively Explored and Evaluated Actor-Critic With Expert-Guided Policy Learning and Fuzzy Feedback Reward for Robotic Trajectory Generation.
Fengkang Ying, Huashan Liu, Rongxin Jiang, Menghua Dong
Published in: IEEE Trans. Ind. Informatics (2022)
Keyphrases
- actor critic
- policy gradient
- reinforcement learning
- inverse reinforcement learning
- learning algorithm
- average reward
- policy gradient methods
- action selection
- temporal difference
- reinforcement learning algorithms
- reward function
- partially observable
- policy iteration
- state action
- optimal control
- approximate dynamic programming
- Markov decision processes
- supervised learning
- active learning