VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval.
Siteng Huang, Biao Gong, Yulin Pan, Jianwen Jiang, Yiliang Lv, Yuyuan Li, Donglin Wang
Published in: CoRR (2022)
Keyphrases
- cross-modal
- multiple modalities
- multi-modal
- multimedia retrieval
- video search
- visual data
- text retrieval
- multimedia documents
- multimedia
- multimedia databases
- image retrieval
- information retrieval
- video data
- semantic concepts
- video content
- video sequences
- multimedia data
- visual similarity
- video streams
- visual recognition
- content-based retrieval
- multimedia information retrieval
- text mining
- video analysis
- semantic content
- video frames
- web images
- video clips
- semantic similarity
- structured documents
- co-occurrence
- image data
- text data
- key frames
- document retrieval