Watch to Listen Clearly: Visual Speech Enhancement Driven Multi-modality Speech Recognition.
Bo Xu, Jacob Wang, Cheng Lu, Yandong Guo
Published in: WACV (2020)
Keyphrases
- speech recognition
- multi modality
- speech enhancement
- noisy environments
- speech signal
- noisy speech
- multi modal
- medical images
- hidden Markov models
- spectral subtraction
- speaker identification
- automatic speech recognition
- mutual information
- information theoretic
- image registration
- vocal tract
- background noise
- pattern recognition
- language model
- linear prediction
- additive noise
- noise reduction
- visual information
- visual features
- speech synthesis
- high dimensional
- signal to noise ratio