Resource aware design of a deep convolutional-recurrent neural network for speech recognition through audio-visual sensor fusion.
Matthijs Van Keirsbilck, Bert Moons, Marian Verhelst
Published in: CoRR (2018)
Keyphrases
- speech recognition
- audio visual
- audio visual speech recognition
- recurrent neural networks
- sensor fusion
- neural network
- multi stream
- hidden markov models
- automatic speech recognition
- noisy environments
- speech signal
- visual data
- speaker identification
- real time
- language model
- multi sensor
- feed forward
- multi modal
- audio features
- speech recognition systems
- mobile robot
- low level
- pattern recognition
- visual information
- image classification