Mutual Alignment between Audiovisual Features for End-to-End Audiovisual Speech Recognition.
Hong Liu, Yawei Wang, Bing Yang
Published in: ICPR (2020)
Keyphrases
- end to end
- speech recognition
- speech recognition systems
- hidden markov models
- automatic speech recognition
- speech signal
- cepstral coefficients
- language model
- speech processing
- feature space
- speech synthesis
- multimedia content
- feature set
- feature extraction
- speech recognition technology
- speech recognizers
- video retrieval
- audio visual
- computer vision
- neural network
- emotion recognition
- extracting features
- speaker identification
- low level
- feature vectors
- mel frequency cepstral coefficients
- mobile devices
- pattern recognition
- speech retrieval
- feature selection