CMMix: Cross-Modal Mix Augmentation Between Images and Texts for Visual Grounding.
Tao Hong, Ya Wang, Xingwu Sun, Xiaoqing Li, Jinwen Ma
Published in: ICONIP (12) (2023)
Keyphrases
- cross modal
- perceptual information
- visual similarity
- image retrieval
- visual data
- multi modal
- image database
- image data
- test images
- multiple modalities
- image collections
- visual features
- visual information
- multimedia retrieval
- visual concepts
- web images
- image understanding
- spatial relationships
- image regions
- image features
- visual recognition
- image classification
- low level
- object recognition
- multimedia databases
- visual content
- high level
- video sequences