Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis.
Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King
Published in: Odyssey (2020)
Keyphrases
- text to speech synthesis
- quality estimation
- text to speech
- prosodic features
- speech recognition
- speaker recognition
- audio visual
- automatic speech recognition
- speaker verification
- speaker identification
- speech synthesis
- quality metrics
- speaker dependent
- low level
- information retrieval
- broadcast news
- emotion recognition
- vocal tract
- speech signal
- action recognition
- synthesized speech