Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition.
Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa
Published in: INTERSPEECH (2013)
Keyphrases
- audio visual speech recognition
- noisy environments
- audio visual
- speech enhancement
- multi stream
- hidden Markov models
- multi modal
- speech recognition
- noise reduction
- speaker verification
- visual information
- speech signal
- automatic speech recognition
- multiscale
- multimedia
- visual data
- contextual information
- video sequences