Relational Attention with Textual Enhanced Transformer for Image Captioning.
Lifei Song, Yiwen Shi, Xinyu Xiao, Chunxia Zhang, Shiming Xiang
Published in: PRCV (3) (2021)
Keyphrases
- image data
- input image
- image classification
- image content
- single image
- multiscale
- image features
- template matching
- image analysis
- high resolution
- low level
- data model
- vector field
- test images
- image matching
- image retrieval
- image segmentation
- lighting conditions
- image regions
- keypoints
- edge detection
- relational data
- neural network
- feature points
- pixel values
- image pixels
- image structure
- image representation
- grey level
- spatial information
- gray level
- segmentation algorithm
- image processing
- metadata
- computer vision