Augmenting Images for ASR and TTS Through Single-Loop and Dual-Loop Multimodal Chain Framework.
Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Published in: INTERSPEECH (2020)
Keyphrases
- image data
- multiple images
- input image
- image database
- object recognition
- image analysis
- image classification
- ground truth
- multi modal
- three dimensional
- edge detection
- image collections
- image retrieval
- image registration
- feature points
- image features
- segmentation method
- spatial information
- rigid body
- multiple modalities
- region of interest
- probabilistic model
- text to speech