Improving RNN-T ASR Accuracy Using Context Audio.
Andreas Schwarz, Ilya Sklyar, Simon Wiesler
Published in: Interspeech (2021)
Keyphrases
- high accuracy
- computational cost
- recurrent neural networks
- classification accuracy
- nearest neighbor
- signal processing
- error rate
- prediction accuracy
- contextual information
- speech recognition
- computational efficiency
- visual information
- correlation coefficient
- audio visual
- search engine
- visual data
- context dependent
- multi modal
- decision trees