The Unreasonable Effectiveness of CLIP Features for Image Captioning: An Experimental Analysis.
Manuele Barraco, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara
Published in: CVPR Workshops (2022)
Keyphrases
- image features
- extracting features
- input image
- low level
- single image
- test images
- image pixels
- grey level
- image classification
- image content
- image data
- extracted features
- multiscale
- matching process
- feature matching
- invariant features
- image analysis
- spatial distribution
- feature representation
- original images
- salient features
- feature vectors
- feature values
- individual features
- similarity measure
- textural features
- feature space
- image set
- global features
- image representation
- image description
- image matching
- high resolution
- spatial information
- keypoints
- feature set
- sample images
- color features
- discriminatory power
- color histogram
- video clips
- object recognition
- image retrieval
- feature points
- co-occurrence
- object detection
- feature selection
- feature extractor
- image segmentation
- feature extraction
- relevance feedback
- visual appearance
- segmentation algorithm
- image regions
- spatial relationships
- image structure
- target object
- feature detection