Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning.

Published in: AAAI (2024)

Keyphrases