Bimodal Speech Recognition Fusing Audio-Visual Modalities.
Alexey Karpov, Alexander L. Ronzhin, Irina S. Kipyatkova, Andrey Ronzhin, Vasilisa Verkhodanova, Anton I. Saveliev, Milos Zelezný
Published in: HCI (2) (2016)
Keyphrases
- audio visual
- speech recognition
- multimodal fusion
- emotion recognition
- visual data
- multi modal
- audio visual speech recognition
- visual information
- language model
- hidden Markov models
- speech recognizer
- pattern recognition
- speech signal
- multi stream
- automatic speech recognition
- multimedia
- speaker verification
- noisy environments
- audio features
- speech recognition systems
- image processing
- speaker identification
- image sequences
- visual content
- visual features