Large-scale Vision-Language Models Learn Super Images for Efficient and High-Performance Partially Relevant Video Retrieval.

Published in: CoRR (2023)
