CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation.
Wei Chen, Lin Li, Yongqi Yang, Bin Wen, Fan Yang, Tingting Gao, Yu Wu, Long Chen
Published in: CoRR (2024)
Keyphrases
- input image
- image dataset
- image analysis
- image data
- image classification
- image features
- low level
- multiscale
- template matching
- similarity measure
- test images
- image content
- single image
- image representation
- edge detection
- high resolution
- text retrieval
- information retrieval
- image segmentation
- feature points
- scanned documents
- image collections
- hough transform
- image retrieval
- image set
- database
- street view
- spatial information
- segmentation method
- super resolution
- text mining
- keywords