DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis.
Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari
Published in: CoRR (2019)
Keyphrases
- speech recognition
- speech synthesis
- prosodic features
- vocal tract
- speaker verification
- audio-visual
- automatic speech recognition
- text-to-speech
- speaker recognition
- speaker diarization
- speech signal
- speaker identification
- hidden Markov models
- image processing
- neural network
- image quality
- language model
- dimensionality reduction
- semi-supervised
- information retrieval