A robust audio-visual speech recognition using audio-visual voice activity detection.
Satoshi Tamura, Masato Ishikawa, Takashi Hashiba, Shin'ichi Takeuchi, Satoru Hayamizu
Published in: INTERSPEECH (2010)
Keyphrases
- audio visual
- audio visual speech recognition
- person authentication
- noisy environments
- voice activity detection
- visual speech
- multi modal
- multi stream
- visual information
- speaker verification
- audio features
- visual data
- noise reduction
- multimedia
- speech recognition
- emotion recognition
- speaker identification
- hidden Markov models
- video sequences
- noisy images
- image processing