METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2024)
Keyphrases
- cross lingual
- text to speech
- prosodic features
- emotion recognition
- text to speech synthesis
- speaker verification
- transfer learning
- emotional state
- speech synthesis
- audio visual
- cross lingual information retrieval
- machine translation
- cross language
- language independent
- language modeling
- text classification
- monolingual and cross lingual
- parallel corpus
- multi lingual
- translation model
- facial expressions
- word processing
- news articles
- language specific
- affective states
- language model
- query translation
- automatic speech recognition
- indian languages
- information extraction
- semi supervised
- active learning
- reinforcement learning
- machine learning
- machine translation system
- labeled data
- semi supervised learning
- speech recognition
- sentiment analysis
- document clustering