Mask-CTC-Based Encoder Pre-Training for Streaming End-to-End Speech Recognition.
Huaibo Zhao, Yosuke Higuchi, Yusuke Kida, Tetsuji Ogawa, Tetsunori Kobayashi
Published in: EUSIPCO (2023)
Keyphrases
- end-to-end
- speech recognition
- scalable video
- isolated word
- hidden Markov models
- automatic speech recognition
- rate adaptation
- speech synthesis
- acoustic models
- speech signal
- language model
- speech recognition technology
- speech recognition systems
- rate allocation
- congestion control
- noisy environments
- speech recognizer
- speaker identification
- pattern recognition
- training set
- image processing
- speaker independent
- transport layer
- data streams
- discriminative training
- error control
- rate distortion
- computer vision