Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra.
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari
Published in: Comput. Speech Lang. (2019)
Keyphrases
- text-to-speech synthesis
- generative model
- short-time Fourier transform
- power spectra
- frequency response
- instantaneous frequency
- network analysis
- computer networks
- network structure
- network design
- complex networks
- frequency domain
- principal component analysis
- social networks
- real time
- unsupervised learning
- data driven
- text mining
- network topologies
- network size
- prior knowledge
- multiscale
- data mining
- data sets