TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline.
Chengfei Li, Shuhao Deng, Yaoping Wang, Guangjing Wang, Yaguang Gong, Changbin Chen, Jinfeng Bai
Published in: CoRR (2022)
Keyphrases
- speech recognition
- open source
- source code
- speech recognition technology
- broadcast news
- statistical machine translation
- automatic speech recognition
- speech recognizer
- word error rate
- language model
- speaker independent
- speaker identification
- conversational speech
- parallel corpus
- hidden markov models
- language identification
- speech signal
- speech synthesis
- speech processing
- linguistic features
- pattern recognition
- machine translation
- isolated word
- noisy environments
- speech retrieval
- handwriting recognition
- english language
- cross language
- speech recognition systems
- natural language
- machine translation system
- text to speech
- language learning
- word sense
- english text
- parallel corpora
- cross lingual
- language modeling
- cross language information retrieval
- text classification
- information retrieval