Audio-visual speaker identification using dynamic facial movements and utterance phonetic content.
Vahid Asadpour, Mohammad Mehdi Homayounpour, Farzad Towhidkhah
Published in: Appl. Soft Comput. (2011)
Keyphrases
- audio visual
- speech recognition
- speaker identification
- emotion recognition
- audio features
- multi modal
- multimedia
- speech signal
- visual information
- visual data
- hidden Markov models
- noisy environments
- language model
- Gaussian mixture model
- multimedia data
- pattern recognition
- automatic speech recognition
- metadata
- multimedia content
- face recognition
- image processing
- broadcast news
- feature selection