Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions.
Yifei Xin, Yuexian Zou
Published in: INTERSPEECH (2023)
Keyphrases
- cross-modal
- text retrieval
- multimedia retrieval
- image retrieval
- multi-modal
- visual features
- multimedia information retrieval
- document collections
- image search
- query expansion
- information retrieval
- retrieval systems
- visual similarity
- visual recognition
- visual content
- multimedia databases
- multimedia
- visual data
- document retrieval
- text categorization
- image database
- image annotation
- content-based retrieval
- image content
- image representation
- keywords