Can Large Multimodal Models Uncover Deep Semantics Behind Images?
Yixin Yang, Zheng Li, Qingxiu Dong, Heming Xia, Zhifang Sui
Published in: CoRR (2024)
Keyphrases
- image data
- ground truth
- input image
- image registration
- image database
- image features
- image regions
- parametric models
- image classification
- image analysis
- image retrieval
- three dimensional
- multiple images
- rigid body
- multi modal
- illumination conditions
- segmentation method
- image set
- image collections
- visual effects
- region of interest
- test images
- single image
- lighting conditions
- image processing
- random fields
- computer graphics
- scale space
- formal semantics
- three dimensional objects
- mutual information
- multimodal image registration