Text-Free Image-to-Speech Synthesis Using Learned Segmental Units.
Wei-Ning Hsu, David Harwath, Christopher Song, James R. Glass
Published in: CoRR (2020)
Keyphrases
- speech synthesis
- image features
- image data
- template matching
- image analysis
- multiscale
- text to speech
- image representation
- single image
- image classification
- high resolution
- low level
- scanned documents
- input image
- image pixels
- edge detection
- hidden markov models
- information retrieval
- image segmentation
- test images
- image content
- image retrieval
- speech recognition
- computer vision
- text information
- neural network
- similarity measure
- pixel values
- textual information
- image collections
- keywords
- web images
- text mining
- face recognition
- segmentation algorithm
- image regions
- segmentation method
- image restoration