Input Length Matters: An Empirical Study Of RNN-T And MWER Training For Long-form Telephony Speech Recognition.
Zhiyun Lu, Yanwei Pan, Thibault Doutre, Liangliang Cao, Rohit Prabhavalkar, Chao Zhang, Trevor Strohman
Published in: CoRR (2021)
Keyphrases
- speech recognition
- wall street journal corpus
- hidden markov models
- isolated word
- automatic speech recognition
- language model
- speech understanding
- noisy environments
- speech recognizer
- recurrent neural networks
- pattern recognition
- speech processing
- speech synthesis
- acoustic models
- speech signal
- speech recognition systems
- speech recognition technology
- speech recognizers
- neural network
- speaker identification
- cepstral coefficients
- machine learning
- keyword spotting
- digital video library