Controllable cross-speaker emotion transfer for end-to-end speech synthesis.
Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Lei Xie
Published in: CoRR (2021)
Keyphrases
- end to end
- speech synthesis
- speech recognition
- prosodic features
- vocal tract
- text to speech
- automatic speech recognition
- language model
- wireless ad hoc networks
- admission control
- congestion control
- hidden markov models
- ad hoc networks
- high bandwidth
- facial expressions
- multipath
- speech signal
- pattern recognition
- noisy environments
- internet protocol
- image processing
- text localization and recognition
- speaker verification
- application layer
- video coding
- face recognition