Reward estimation with scheduled knowledge distillation for dialogue policy learning.
Junyan Qiu, Haidong Zhang, Yiping Yang
Published in: Connect. Sci. (2023)
Keyphrases
- knowledge acquisition
- learning systems
- prior knowledge
- supervised learning
- learning algorithm
- knowledge transfer
- reinforcement learning
- knowledge level
- learning process
- online learning
- knowledge management
- complex domains
- policy gradient
- inverse reinforcement learning
- background knowledge
- knowledge representation
- learning agent
- learned knowledge
- partially observable environments