Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?
Mingqian Feng, Yunlong Tang, Zeliang Zhang, Chenliang Xu
Published in: CoRR (2024)
Keyphrases
- input image
- multiscale
- image features
- single image
- post processing
- image classification
- image analysis
- image segmentation
- template matching
- image data
- high resolution
- edge detection
- image retrieval
- image collections
- image content
- image regions
- image pixels
- image noise
- keypoints
- image representation
- grey level
- energy function
- Hough transform
- natural images
- vector field
- feature points
- moving objects
- image structure
- object recognition
- contrast enhancement
- pixel level
- image sequences