MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training.
Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel
Published in: CoRR (2023)
Keyphrases
- multi modal
- multiple modalities
- image data
- auto annotation
- input image
- fusing multiple
- multi modality
- video search
- multiscale
- web images
- image classification
- single modality
- image collections
- image features
- image representation
- edge detection
- uni modal
- cross modal
- image analysis
- parametric models
- audio visual
- image annotation
- image content
- semantic concepts
- high resolution
- image segmentation
- feature vectors
- low level
- image retrieval
- segmentation method
- high dimensional