Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features.
Nicola Messina, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro, Stéphane Marchand-Maillet
Published in: CBMI (2021)
Keyphrases
- cross modal
- multi modal
- visual similarity
- multimedia retrieval
- multimedia databases
- image retrieval
- low level
- feature extraction
- perceptual information
- multimedia
- visual recognition
- visual data
- image features
- image classification
- visual features
- indexing structure
- feature vectors
- feature space
- keywords
- document retrieval
- information retrieval systems
- image database
- co occurrence