An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition.
Devesh Walawalkar, Yihui He, Rohit Pillai
Published in: CoRR (2018)
Keyphrases
- speech recognition
- audio visual speech recognition
- audio visual
- multi stream
- multi modal
- speech synthesis
- automatic speech recognition
- pattern recognition
- hidden markov models
- acoustic models
- speech signal
- visual data
- language model
- probabilistic model
- speaker identification
- speech recognizer
- data mining
- noisy environments
- emotion recognition
- image data
- speech recognition systems
- high level