MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions.

Published in: CoRR (2021)

Keyphrases