Show and Speak: Directly Synthesize Spoken Description of Images.
Xinsheng Wang, Siyuan Feng, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg
Published in: ICASSP (2021)
Keyphrases
- input image
- image data
- ground truth
- image classification
- image database
- image features
- image registration
- three dimensional
- image retrieval
- image analysis
- edge detection
- image understanding
- multiple images
- test images
- small number
- rigid body
- segmentation method
- image collections
- feature points
- image processing algorithms
- spatial information
- image matching
- fully automatic
- computer vision
- original images
- pixel values
- satellite images
- image set
- image regions
- computer graphics
- speech recognition
- similarity measure
- high level