ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech Synthesis.
Haobin Tang, Xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang
Published in: ICASSP (2024)
Keyphrases
- speech synthesis
- cross domain
- text to speech
- emotional state
- emotion recognition
- multiscale
- facial expressions
- speech recognition
- prosodic features
- affect sensing
- knowledge transfer
- multiple domains
- affective states
- decision trees
- vocal tract
- text categorization
- sentiment classification
- domain adaptation
- audio visual
- transfer learning
- image segmentation