Can Large Multimodal Models Uncover Deep Semantics Behind Images?
Yixin Yang, Zheng Li, Qingxiu Dong, Heming Xia, Zhifang Sui
Published in: ACL (Findings) (2024)
Keyphrases
- image data
- input image
- random fields
- image database
- image retrieval
- ground truth
- image features
- image analysis
- three dimensional
- rigid body
- object recognition
- geometric models
- computer graphics
- edge detection
- image registration
- image classification
- probabilistic model
- test images
- parametric models
- image collections
- image statistics
- visual effects
- lighting conditions
- logic programming
- feature points
- high resolution
- similarity measure