Cross on Cross Attention: Deep Fusion Transformer for Image Captioning.
Jing Zhang, Yingshuai Xie, Weichao Ding, Zhe Wang
Published in: IEEE Trans. Circuits Syst. Video Technol. (2023)
Keyphrases
- image data
- fusion method
- image features
- single image
- input image
- image pixels
- image retrieval
- image analysis
- high resolution
- image collections
- image content
- image classification
- image sequences
- template matching
- test images
- segmentation method
- multiscale
- data fusion
- low level
- similarity measure
- fusion methods
- edge detection
- image matching
- multiresolution
- region of interest
- visual attention
- image set
- feature extraction
- image segmentation
- fusion process