Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech.
Haibin Wu, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Daniel Tompkins, Chung-Hsien Tsai, Canrun Li, Zhen Xiao, Sheng Zhao, Jinyu Li, Naoyuki Kanda
Published in: CoRR (2024)
Keyphrases
- text to speech
- emotional state
- speech synthesis
- matching algorithm
- text to speech synthesis
- prosodic features
- facial expressions
- emotion recognition
- image matching
- programming tool
- physiological signals
- virtual characters
- writing skills
- word processing
- flow field
- english text
- image sequences
- working memory
- human decision making
- computer science
- object recognition
- video sequences