Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training.
Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng
Published in: CoRR (2023)
Keyphrases
- music information retrieval
- supervised learning
- acoustic features
- emotion recognition
- wide range
- test set
- speech recognition
- audio visual
- training set
- speech signal
- unsupervised learning
- semi-supervised
- training algorithm
- automatic speech recognition
- feature vectors
- feature space
- unsupervised manner
- supervised training
- visual features
- mel frequency cepstral coefficients
- training process
- training examples
- data sets
- online learning
- data driven
- hidden Markov models
- video sequences
- training data
- face recognition
- information retrieval