Label-attention transformer with geometrically coherent objects for image captioning.
Shikha Dubey, Farrukh Olimov, Muhammad Aasim Rafique, Joonmo Kim, Moongu Jeon
Published in: Inf. Sci. (2023)
Keyphrases
- image regions
- single image
- multiscale
- image pixels
- image data
- image analysis
- image features
- multiple objects
- image retrieval
- image content
- edge detection
- static images
- keypoints
- semantic labels
- individual objects
- low level
- bounding box
- spatial relationships
- image collections
- complex scenes
- input image
- region of interest
- geometric constraints
- scene understanding
- segmentation algorithm
- pixel values
- visual appearance
- spatial relations
- target object
- image segmentation
- test images
- segmentation method
- image representation
- high resolution
- image matching
- focus of attention
- similar objects
- real world objects
- d objects
- object features
- image segments
- foreground background separation