MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs.
Ziyu Liu, Tao Chu, Yuhang Zang, Xilin Wei, Xiaoyi Dong, Pan Zhang, Zijian Liang, Yuanjun Xiong, Yu Qiao, Dahua Lin, Jiaqi Wang
Published in: CoRR (2024)
Keyphrases
- image analysis
- image data
- multiscale
- input image
- template matching
- image dataset
- image retrieval
- image classification
- single image
- weakly labeled
- image structure
- image collections
- image set
- image segmentation
- image content
- edge detection
- image database
- image features
- computer vision
- segmentation method
- image enhancement
- image sequences