Minimum Latency Training of Sequence Transducers for Streaming End-to-End Speech Recognition.
Yusuke Shinohara, Shinji Watanabe
Published in: CoRR (2022)
Keyphrases
- end to end
- speech recognition
- wall street journal corpus
- scalable video
- isolated word
- language model
- hidden markov models
- rate adaptation
- automatic speech recognition
- acoustic models
- speech processing
- speech signal
- speech recognizer
- pattern recognition
- speech recognition technology
- speaker identification
- noisy environments
- congestion control
- speech synthesis
- application layer
- speech recognition systems
- discriminative training
- training process
- machine learning
- transport layer
- data streams