DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis.
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari
Published in: SSW (2019)
Keyphrases
- speech recognition
- speech synthesis
- prosodic features
- speaker verification
- vocal tract
- speaker recognition
- audio-visual
- automatic speech recognition
- hidden Markov models
- speaker diarization
- similarity measure
- text-to-speech
- speaker identification
- pattern recognition
- information retrieval
- noisy environments
- semi-supervised
- synthesized speech