SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech.
Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim
Published in: CoRR (2022)
Keyphrases
- prosodic features
- text to speech
- speech synthesis
- speaker verification
- speaker recognition
- audio visual
- speech recognition
- text to speech synthesis
- automatic speech recognition
- multi layer
- speaker identification
- management system
- programming tool
- real time
- English text
- image sequences
- speaker diarization
- flow patterns
- hierarchical architecture
- neural network