Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training.
Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng
Published in: ICASSP (2023)
Keyphrases
- music information retrieval
- acoustic features
- supervised learning
- emotion recognition
- wide range
- supervised training
- data driven
- unsupervised learning
- speaker verification
- music retrieval
- program synthesis
- multi modal
- training phase
- dynamic time warping
- mel frequency cepstral coefficients
- speech signal
- audio visual
- information retrieval
- speech recognition
- test set
- facial expressions
- semi supervised
- similarity measure