Fusing Acoustic and Text Emotional Features for Expressive Speech Synthesis.
Ying Feng, Pengfei Duan, Yunfei Zi, Yaxiong Chen, Shengwu Xiong
Published in: ICME (2022)
Keyphrases
- speech synthesis
- text-to-speech
- feature extraction
- emotion recognition
- speech recognition
- emotional speech
- prosodic features
- structural features
- text retrieval
- database
- feature set
- image features
- co-occurrence
- semantic information
- extracted features
- additional features
- feature fusion
- information retrieval
- facial expressions
- low level
- feature vectors
- pattern recognition
- video sequences
- speech recognition systems
- image processing