CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum.

Shuang Ao, Tianyi Zhou, Guodong Long, Qinghua Lu, Liming Zhu, Jing Jiang

Published in: NeurIPS (2021)

Keyphrases

reinforcement learning
action selection
goal oriented
function approximation
learning algorithm
partially observable
state space
macro actions
collaborative learning
planning process
cooperative learning
optimal policy
heuristic search
Markov decision processes
peer assessment
complex domains
machine learning
dynamic programming
multi agent
learning classifier systems
computer supported collaborative learning
high school
temporal difference
knowledge sharing
domain independent
partially observable Markov decision processes
control system
reinforcement learning methods
neural network