Multimodal speaker/speech recognition using lip motion, lip texture and audio.
Hasan Ertan Çetingül, Engin Erzin, Yücel Yemez, A. Murat Tekalp
Published in: Signal Process. (2006)
Keyphrases
- visual speech
- audio visual
- audio visual speech recognition
- hidden markov models
- lip reading
- speaker identification
- video signals
- multi stream
- noisy environments
- visual data
- speech signal
- multi modal
- visual information
- audio signals
- multimodal fusion
- motion model
- image sequences
- broadcast news
- acoustic features
- gaussian mixture model
- speech recognition
- optical flow
- text to speech
- space time
- mouth region
- feature extraction
- automatic transcription
- prosodic features
- speaker dependent
- speaker verification
- motion features
- motion estimation
- multimodal interaction
- motion analysis
- audio stream
- multimedia
- speaker recognition
- speech synthesis
- music information retrieval
- moving objects
- video sequences