ExpansionNet: exploring the sequence length bottleneck in the Transformer for Image Captioning.

Published in: CoRR (2022)

Keyphrases

input image
image segmentation
image features
single image
image analysis
image data
multiscale
template matching
image classification
image noise
image retrieval
image collections
high resolution
segmentation method
image pixels
image representation
image content
test images
image matching
input data
neural network
Hough transform
edge detection
lighting conditions
similarity measure
image structure
pixel values
fixed length