A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs.
Zicheng Zhang, Haoning Wu, Erli Zhang, Guangtao Zhai, Weisi Lin
Published in: CoRR (2024)
Keyphrases
- multi modal
- image annotation
- low level vision
- fusing multiple
- image understanding
- input image
- image classification
- image analysis
- high dimensional
- multiple modalities
- single modality
- probabilistic model
- object recognition
- high level vision
- image features
- image retrieval
- image collections
- audio visual
- image regions
- multi modality
- cross modal
- image matching
- image registration
- web images
- similarity measure
- active contours
- graphical models
- high quality