Show and Speak: Directly Synthesize Spoken Description of Images.
Xinsheng Wang, Siyuan Feng, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg
Published in: CoRR (2020)
Keyphrases
- image data
- image database
- image analysis
- image features
- input image
- ground truth
- test images
- illumination conditions
- multiple images
- edge detection
- image understanding
- lighting conditions
- three dimensional
- image processing algorithms
- small number
- image registration
- rigid body
- object recognition
- original images
- image pixels
- segmentation method
- gabor filters
- image annotation
- image set
- image regions
- segmentation algorithm
- feature points
- image classification
- high level
- image matching
- image structure
- keypoints
- computer graphics
- image retrieval