Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning.

Published in: NeurIPS (2022)

Keyphrases