Geometry-Entangled Visual Semantic Transformer for Image Captioning.
Ling Cheng, Wei Wei, Feida Zhu, Yong Liu, Chunyan Miao
Published in: CoRR (2021)
Keyphrases
- low level
- input image
- image data
- image content
- image classification
- image analysis
- visual concepts
- visually similar
- image features
- image segmentation
- visual data
- segmentation method
- visual perception
- image representation
- semantic gap
- visual cues
- multiscale
- auto annotation
- high level semantics
- image retrieval
- image regions
- geometric constraints
- image collections
- web images
- visual appearance
- single image
- visual features
- edge detection
- similarity measure
- geometric information
- semantic content
- human observers
- visual attributes
- visual effects
- three dimensional
- pixel values
- spatial relations
- spatial information
- semantic information
- segmentation algorithm
- high resolution
- image search
- low level features
- geometric features
- image set
- fault diagnosis
- visual similarity
- semantic space
- fuzzy logic
- object recognition