MGTANet: Multi-Scale Guided Token Attention Network for Image Captioning.
Wenhao Jia, Ronggui Wang, Juan Yang, Lixia Xue
Published in: CSAIDE (2024)
Keyphrases
- multiscale
- image representation
- edge detection
- input image
- image data
- multiple scales
- image retrieval
- image features
- image content
- image segmentation
- wavelet decomposition
- single image
- hough transform
- segmentation method
- template matching
- image analysis
- image structure
- image classification
- image pixels
- spatial color
- coarse to fine
- natural images
- image regions
- feature points
- similarity measure
- image processing
- segmentation algorithm
- region of interest
- neural network
- image quality
- medical images
- feature extraction