Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval.
Qian Wang, Jia-Chen Gu, Zhen-Hua Ling
Published in: CoRR (2024)
Keyphrases
- cross modal
- text retrieval
- multimedia retrieval
- multiscale
- image retrieval
- multi modal
- visual similarity
- multimedia information retrieval
- keypoints
- retrieval systems
- query expansion
- image processing
- image representation
- document collections
- similarity measure
- visual recognition
- information retrieval
- document retrieval
- visual data
- multimedia
- distance measure
- image content
- multimedia databases
- semantic similarity
- retrieval model
- information retrieval systems
- visual features
- image database
- visual information
- low level features
- text classification
- vector space
- video data