GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement.
Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen
Published in: CoRR (2024)
Keyphrases
- multi-domain
- cross-domain
- spontaneous speech
- domain-specific
- statistical machine translation
- spoken dialogue systems
- search engine
- real-world
- search computing
- resource discovery
- spoken language
- conversational speech
- automatic speech recognition
- role-based access control
- handwriting recognition
- heterogeneous networks
- cross-lingual