COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval.
Haoyu Lu
Nanyi Fei
Yuqi Huo
Yizhao Gao
Zhiwu Lu
Ji-Rong Wen
Published in:
CoRR (2022)
Keyphrases
cross modal
multi modal
information retrieval
computer vision
web pages
high level
low level
natural language processing
information retrieval systems
query expansion
action recognition
image understanding