Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing.
Fangkai Jiao, Chengwei Qin, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- active learning
- learning algorithm
- learning process
- domain independent
- online learning
- learning systems
- learning tasks
- macro actions
- information retrieval
- control knowledge
- learning analytics
- planning problems
- development process
- heuristic search
- background knowledge
- supervised learning
- image sequences