Text-Free Image-to-Speech Synthesis Using Learned Segmental Units.
Wei-Ning Hsu, David Harwath, Tyler Miller, Christopher Song, James R. Glass
Published in: ACL/IJCNLP (1) (2021)
Keyphrases
- speech synthesis
- single image
- input image
- text to speech
- image data
- image analysis
- image classification
- image content
- image retrieval
- template matching
- image features
- multiscale
- image regions
- image segmentation
- web images
- edge detection
- image representation
- similarity measure
- region of interest
- low level
- image collections
- test images
- feature extraction
- information retrieval
- image pixels
- text information
- computer vision
- image matching
- segmentation method
- text mining
- pattern recognition