Audio-visual feature fusion via deep neural networks for automatic speech recognition.
Mohammad Hasan Rahmani, Farshad Almasganj, Seyyed Ali Seyyedsalehi
Published in: Digit. Signal Process. (2018)
Keyphrases
- audio-visual
- automatic speech recognition
- feature fusion
- neural network
- speech recognition
- multi-modal
- multiple features
- visual information
- feature extraction
- speech signal
- pattern recognition
- hidden Markov models
- broadcast news
- visual data
- multimedia
- noisy environments
- acoustic features
- nearest neighbor
- multiscale
- audio features
- data mining
- image classification
- co-occurrence
- feature vectors
- image processing
- e-learning