COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval.

Published in: CVPR (2022)

Keyphrases