Improving Seq2Seq TTS Frontends With Transcribed Speech Audio.
Siqi Sun, Korin Richmond, Hao Tang
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2023)
Keyphrases
- text to speech
- prosodic features
- speech synthesis
- spontaneous speech
- audio visual
- audio stream
- broadcast news
- multimodal interaction
- audio signals
- cepstral features
- speech recognition
- emotion recognition
- word processing
- digital audio
- neural network
- speaker identification
- speaker verification
- human machine interaction
- spoken language
- automatic transcription
- multimedia