Hierarchical Timbre-Cadence Speaker Encoder for Zero-shot Speech Synthesis.
Joun Yeop Lee, Jae-Sung Bae, Seongkyu Mun, Jihwan Lee, Ji-Hyun Lee, Hoon-Young Cho, Chanwoo Kim
Published in: INTERSPEECH (2023)
Keyphrases
- speech synthesis
- speech recognition
- prosodic features
- vocal tract
- text to speech
- automatic speech recognition
- hidden markov models
- speaker identification
- bit rate
- low complexity
- language model
- rate distortion
- speech signal
- hierarchical structure
- speech corpus
- speaker diarization
- video codec
- video compression
- acoustic features
- distributed video coding