Vision-Enhanced and Consensus-Aware Transformer for Image Captioning.
Shan Cao, Gaoyun An, Zhenxing Zheng, Zhiyong Wang
Published in: IEEE Trans. Circuits Syst. Video Technol. (2022)
Keyphrases
- image data
- input image
- multiscale
- image content
- image features
- visual perception
- single image
- image retrieval
- image representation
- segmentation method
- image classification
- image regions
- low level
- high resolution
- image collections
- computer vision
- image pixels
- template matching
- image analysis
- image segmentation
- real time
- keypoints
- region of interest
- grey level
- image synthesis
- image processing
- human visual system
- test images
- image structure
- image matching
- image restoration
- edge detection