Singing Voice Phoneme Segmentation by Hierarchically Inferring Syllable and Phoneme Onset Positions.
Rong Gong, Xavier Serra
Published in: INTERSPEECH (2018)
Keyphrases
- speech synthesis
- prosodic features
- speech recognition
- text to speech
- image segmentation
- level set
- segmentation algorithm
- speech sounds
- context dependent
- automatic speech recognition
- multiscale
- segmentation method
- motion segmentation
- optimal segmentation
- segmentation accuracy
- hidden Markov models
- multiresolution
- fully unsupervised
- position information
- neural network
- hierarchical structure
- x ray
- automatic speech recognition systems