Scaling Up Vision-Language Pre-training for Image Captioning.
Xiaowei Hu, Zhe Gan, Jianfeng Wang, Zhengyuan Yang, Zicheng Liu, Yumao Lu, Lijuan Wang
Published in: CoRR (2021)
Keyphrases
- image features
- input image
- image content
- image analysis
- image data
- multiscale
- image representation
- image retrieval
- low level image processing
- image classification
- image segmentation
- feature points
- similarity measure
- single image
- template matching
- high resolution
- low level
- programming language
- edge detection
- pixel values
- image collections
- language learning
- neural network
- visual perception
- region of interest
- image matching
- spatial information
- Hough transform
- segmentation method
- natural language
- test images
- training examples
- image set
- super resolution
- vision system
- image pixels
- training data
- labeled images