M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis.
Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, Jianhua Tao, Jianqing Sun, Jiaen Liang
Published in: ICASSP (2023)
Keyphrases
- end to end
- multi modal
- text to speech synthesis
- multiscale
- text to speech
- cross modal
- wireless ad hoc networks
- multi modality
- high dimensional
- image processing
- admission control
- wavelet transform
- image representation
- audio visual
- congestion control
- edge detection
- wavelet coefficients
- image segmentation
- real world
- uni modal
- transport layer
- internet protocol
- multiple modalities
- low level
- web services