An Unsupervised Method to Select a Speaker Subset from Large Multi-Speaker Speech Synthesis Datasets.
Pilar Oplustil Gallegos, Jennifer Williams, Joanna Rownicka, Simon King
Published in: INTERSPEECH (2020)
Keyphrases
- speech recognition
- detection method
- speech synthesis
- preprocessing
- clustering method
- cost function
- dynamic programming
- learning algorithm
- image processing
- high accuracy
- speaker recognition
- speaker verification
- segmentation method
- computational cost
- classification method
- prosodic features
- data sets
- supervised learning
- hidden markov models
- significant improvement
- training set
- objective function
- computer vision
- neural network