U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech.
Xin Jing, Yi Chang, Zijiang Yang, Jiangjian Xie, Andreas Triantafyllopoulos, Björn W. Schuller
Published in: CoRR (2023)
Keyphrases
- text to speech
- speech synthesis
- computer vision
- vision system
- anisotropic diffusion
- prosodic features
- word processing
- programming tool
- real time
- fuzzy logic
- image processing
- text to speech synthesis
- genetic algorithm
- writing skills
- english text
- diffusion process
- fault diagnosis
- diffusion model
- diffusion processes
- nonlinear diffusion
- information diffusion
- multi modal
- online learning