Exploring Pre-Training with Alignments for RNN Transducer Based End-to-End Speech Recognition.
Hu Hu, Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong
Published in: ICASSP (2020)
Keyphrases
- end-to-end
- speech recognition
- Wall Street Journal corpus
- hidden Markov models
- isolated word
- recurrent neural networks
- speech processing
- acoustic models
- automatic speech recognition
- speech signal
- language model
- speech recognizer
- speech recognition technology
- pattern recognition
- noisy environments
- speech synthesis
- congestion control
- speaker identification
- training set
- speaker independent
- audio visual speech recognition
- training process
- image quality
- speaker dependent
- feature set
- computational complexity