Bimodal variational autoencoder for audiovisual speech recognition.
Hadeer M. Sayed, Hesham E. ElDeeb, Shereen A. Taie
Published in: Mach. Learn. (2023)
Keyphrases
- speech recognition
- language model
- hidden Markov models
- speech recognizer
- image segmentation
- pattern recognition
- automatic speech recognition
- speech understanding
- speech synthesis
- speech processing
- video retrieval
- audio visual
- visual information
- multimedia content
- speech signal
- speech recognizers
- noisy environments
- machine learning
- speaker independent
- cepstral coefficients
- speech recognition errors
- emotion recognition
- feature extraction
- computer vision
- speaker diarization
- keyword spotting
- speech retrieval
- neural network