Unsupervised Improvement of Audio-Text Cross-Modal Representations.
Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fábio Ayres, Paris Smaragdis
Published in: CoRR (2023)
Keyphrases
- cross-modal
- multi-modal
- semantic representations
- multiple modalities
- multimedia retrieval
- visual recognition
- visual data
- image retrieval
- information retrieval
- higher level
- text retrieval
- visual similarity
- multimedia databases
- perceptual information
- object recognition
- text data
- image classification
- text mining
- multimedia