Feature extraction using multimodal convolutional neural networks for visual speech recognition.
Eric Tatulli, Thomas Hueber
Published in: ICASSP (2017)
Keyphrases
- visual speech recognition
- convolutional neural networks
- feature extraction
- visual speech
- speaker identification
- hidden Markov models
- lip reading
- local binary pattern
- wavelet transform
- face recognition
- speech recognition
- dynamic textures
- pattern recognition
- multimodal
- extracted features
- principal component analysis
- feature vectors
- image classification
- texture features
- image processing
- audio-visual
- spatio-temporal
- frequency domain
- feature selection
- texture analysis
- texture classification
- facial expression recognition
- multiscale