Mask-CTC-based Encoder Pre-training for Streaming End-to-End Speech Recognition.
Huaibo Zhao, Yosuke Higuchi, Yusuke Kida, Tetsuji Ogawa, Tetsunori Kobayashi
Published in: CoRR (2023)
Keyphrases
- end-to-end
- speech recognition
- scalable video
- isolated word
- hidden Markov models
- rate adaptation
- language model
- acoustic models
- speech recognizer
- speech recognition systems
- speech signal
- speech recognition technology
- automatic speech recognition
- speech synthesis
- pattern recognition
- rate allocation
- noisy environments
- information retrieval
- speaker identification
- bit rate
- training set
- video sequences
- bitstream
- rate distortion
- congestion control
- visual features
- probabilistic model
- data streams
- machine learning