Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval.
Yizhen Chen, Jie Wang, Lijian Lin, Zhongang Qi, Jin Ma, Ying Shan
Published in: CoRR (2023)
Keyphrases
- multi modal
- text retrieval
- semantic concepts
- video search
- part of speech
- social tagging
- tag recommendation
- metadata
- information retrieval
- multiple modalities
- video sequences
- multimedia
- multi modality
- document retrieval
- video data
- image retrieval
- query expansion
- document collections
- retrieval model
- video content
- image annotation
- multimedia information retrieval
- video frames
- audio visual
- retrieval systems
- video database
- high dimensional
- key frames
- multimedia data
- visual concepts
- relevant documents
- video analysis
- keywords
- high level