Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation.
Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling
Published in: INTERSPEECH (2023)
Keyphrases
- audio visual
- visual data
- image data
- input image
- ultrasound images
- multi modal
- image database
- three dimensional
- image classification
- image features
- edge detection
- image collections
- speech enhancement
- image retrieval
- visual information
- noisy environments
- signal to noise ratio
- pattern recognition
- image regions
- image sequences
- high level
- multimedia
- speech signal
- vocal tract