PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting.
Zixin Guo, Tzu-Jui Julius Wang, Selen Pehlivan, Abduljalil Radman, Jorma Laaksonen
Published in: CoRR (2023)
Keyphrases
- cross modal
- weakly supervised
- multi modal
- multimedia retrieval
- object detectors
- image retrieval
- multimedia databases
- visual similarity
- topic models
- superpixels
- computer vision
- object class
- semi supervised
- training set
- image processing
- training examples
- object categories
- object detection
- natural language
- multiscale
- information retrieval
- named entities
- bounding box