Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments.
Guan-Lin Chao, William Chan, Ian R. Lane
Published in: INTERSPEECH (2016)
Keyphrases
- speech recognition
- audio visual
- audio visual speech recognition
- acoustic models
- hidden Markov models
- multi modal
- language model
- pattern recognition
- speaker recognition
- speech recognizer
- visual information
- noisy environments
- speech signal
- automatic speech recognition
- speaker identification
- speaker verification
- multi stream
- speaker diarization
- speaker independent
- digit recognition
- probabilistic model
- speech synthesis
- visual features
- speaker adaptation
- multimedia
- data mining
- visual data
- n gram