Learning Guidance Rewards with Trajectory-space Smoothing.
Tanmay Gangwani, Yuan Zhou, Jian Peng
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- learning algorithm
- learning process
- active learning
- supervised learning
- low dimensional
- learning systems
- data sets
- learning tasks
- online learning
- space time
- learning scenarios
- vector space
- background knowledge
- optimal policy
- unsupervised learning
- search space
- case study
- knowledge base
- machine learning