Self-imitation guided goal-conditioned reinforcement learning.
Yao Li, YuHui Wang, Xiaoyang Tan
Published in: Pattern Recognit. (2023)
Keyphrases
- reinforcement learning
- function approximation
- state space
- temporal difference
- learning algorithm
- multi agent
- machine learning
- dynamic programming
- learning process
- evolutionary algorithm
- active learning
- bayesian networks
- optimal policy
- learning problems
- decision trees
- real time
- robot control
- temporal difference learning
- agent learns
- robotic control