HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models.
Fuxiao Liu, Tianrui Guan, Zongxia Li, Lichang Chen, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Published in: CoRR (2023)
Keyphrases
- multi modality
- multimodal images
- input image
- single image
- multi modal
- multiscale
- high resolution
- image analysis
- medical images
- edge detection
- information theoretic
- image structure
- energy function
- bayesian framework
- image segmentation
- single modality
- vector field
- segmentation method
- objective function
- data mining
- feature points
- image registration