Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan
Published in: CoRR (2024)
Keyphrases
- multi modal
- visual recognition
- auto annotation
- cross modal
- image classification
- high level
- low level
- single modality
- multiple modalities
- multiscale
- uni modal
- image features
- similarity measure
- image segmentation
- visual categorization
- video search
- visual information
- input image
- high dimensional
- web images
- image content
- segmentation algorithm
- image representation
- computer vision