ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens.
Yangyang Guo, Haoyu Zhang, Liqiang Nie, Yongkang Wong, Mohan S. Kankanhalli
Published in: CoRR (2023)
Keyphrases
- input image
- image data
- single image
- image representation
- image analysis
- image content
- image retrieval
- low level image processing
- image pixels
- computer vision
- multiscale
- image segmentation
- template matching
- image regions
- keypoints
- segmentation algorithm
- region of interest
- image synthesis
- image features
- image collections
- lighting conditions
- vector field
- image set
- color vision
- visual perception
- segmentation method
- image quality
- image classification
- edge detection
- high resolution
- test images
- vision system
- low level
- pixel values
- training set
- natural language
- similarity measure
- low level vision
- neural network
- real time