Multi-modal Visual Understanding with Prompts for Semantic Information Disentanglement of Image.
Yuzhou Peng
Published in: CoRR (2023)
Keyphrases
- multi modal
- semantic information
- auto annotation
- low level
- semantic concepts
- semantic gap
- visual information
- single modality
- cross modal
- image collections
- image data
- low level visual features
- image annotation
- wordnet
- fusing multiple
- low level features
- visual features
- image content
- high level
- web images
- visual concepts
- image segmentation
- visual cues
- image retrieval
- image classification
- multi modality
- semantic meaning
- image representation
- uni modal
- automatic image annotation
- image features
- audio visual
- multiple modalities
- domain knowledge
- video search
- semantic features
- keywords
- image database
- high dimensional
- higher level
- visual similarity
- contextual information
- artificial intelligence
- xml documents
- feature selection
- video sequences
- image regions
- visual data
- image search
- semantic similarity
- databases