Fine-Grained Grounding for Multimodal Speech Recognition.
Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott
Published in: CoRR (2020)
Keyphrases
- speech recognition
- fine grained
- coarse grained
- speech recognizer
- hidden Markov models
- automatic speech recognition
- speech synthesis
- language model
- pattern recognition
- speech signal
- multi modal
- access control
- speech processing
- noisy environments
- speech recognizers
- speaker identification
- speech recognition systems
- speech recognition technology
- metadata
- speaker independent
- speaker dependent
- audio visual
- signal to noise ratio
- multimedia
- computer vision