Fine-Grained Grounding for Multimodal Speech Recognition.
Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott
Published in: EMNLP (Findings) (2020)
Keyphrases
- speech recognition
- fine grained
- coarse grained
- hidden Markov models
- language model
- access control
- speech processing
- speech signal
- automatic speech recognition
- pattern recognition
- speech recognizer
- speech recognition technology
- multi modal
- speech synthesis
- noisy environments
- speech recognition systems
- speaker identification
- speaker independent
- data lineage
- information retrieval
- speech recognizers
- multi stream
- speaker dependent
- probabilistic model
- multimedia
- computer vision