Real-time audio-visual voice activity detection for speech recognition in noisy environments.
Carlos Toshinori Ishi, Miki Sato, Norihiro Hagita, Shihong Lao
Published in: AVSP (2010)
Keyphrases
- noisy environments
- voice activity detection
- speech recognition
- audio visual
- audio visual speech recognition
- speaker verification
- digit recognition
- multi modal
- automatic speech recognition
- multi stream
- hidden Markov models
- speech signal
- visual information
- background noise
- speech synthesis
- language model
- speaker identification
- speech enhancement
- noise reduction
- pattern recognition
- visual data
- word recognition
- multimedia
- emotion recognition
- speaker recognition
- neural network
- acoustic features
- semantic information
- image processing