ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech.
Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu
Published in: CoRR (2022)
Keyphrases
- text to speech
- cross lingual
- prosodic features
- speech synthesis
- text to speech synthesis
- machine translation
- english text
- language independent
- language modeling
- cross lingual information retrieval
- multi lingual
- cross language
- word processing
- parallel corpora
- text classification
- text mining
- translation model
- document clustering
- mono lingual
- audio visual
- word sense
- news articles
- text documents
- language model
- text retrieval