Towards minimum perceptual error training for DNN-based speech synthesis.
Cassia Valentini-Botinhao, Zhizheng Wu, Simon King
Published in: INTERSPEECH (2015)
Keyphrases
- speech synthesis
- training process
- speech recognition
- text to speech
- vocal tract
- error rate
- training set
- prosodic features
- test set
- training phase
- human perception
- data sets
- low level
- training algorithm
- speech corpus
- feedforward neural networks
- visual perception
- training examples
- training samples
- upper bound
- decision trees