Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition.
Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang
Published in: CoRR (2022)
Keyphrases
- speech recognition
- speech signal
- speech synthesis
- speech processing
- speech recognizer
- hidden markov models
- automatic speech recognition
- language model
- speaker identification
- speech recognition systems
- noisy environments
- speech recognizers
- speech recognition technology
- pattern recognition
- computer vision
- speaker independent
- isolated word
- word error rate
- speaker recognition
- recognition engine
- speaker adaptation
- speech retrieval
- noisy speech
- probabilistic model
- feature selection