Pre-training Cross-Modal Retrieval by Expansive Lexicon-Patch Alignment.
Yiyuan Yang, Guodong Long, Michael Blumenstein, Xiubo Geng, Chongyang Tao, Tao Shen, Daxin Jiang
Published in: LREC/COLING (2024)
Keyphrases
- cross modal
- multi modal
- multimedia retrieval
- image retrieval
- multimedia databases
- visual similarity
- visual recognition
- multimedia
- text retrieval
- visual data
- supervised learning
- training set
- information retrieval
- perceptual information
- training examples
- text categorization
- image database
- multimedia information retrieval