ExpansionNet: exploring the sequence length bottleneck in the Transformer for Image Captioning.
Jia-Cheng Hu
Published in: CoRR (2022)
Keyphrases
- input image
- image segmentation
- image features
- single image
- image analysis
- image data
- multiscale
- template matching
- image classification
- image noise
- image retrieval
- image collections
- high resolution
- segmentation method
- image pixels
- image representation
- image content
- test images
- image matching
- input data
- neural network
- Hough transform
- edge detection
- lighting conditions
- similarity measure
- image structure
- pixel values
- fixed length