Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions.
Yifei Xin, Yuexian Zou
Published in: CoRR (2023)
Keyphrases
- cross modal
- text retrieval
- multimedia retrieval
- image retrieval
- multi modal
- visual features
- information retrieval
- document retrieval
- visual content
- multimedia information retrieval
- query expansion
- image search
- retrieval systems
- visual data
- visual recognition
- document collections
- image database
- multimedia
- test collection
- retrieval model
- visual similarity