Learning to combine the modalities of language and video for temporal moment localization.

Jungkyoo Shin, Jinyoung Moon
Published in: Comput. Vis. Image Underst. (2022)
Keyphrases