Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning.

Published in: ICCV (2021)

Keyphrases