Karaoker: Alignment-free singing voice synthesis with speech training data.
Panagiotis Kakoulidis, Nikolaos Ellinas, Georgios Vamvoukakis, Konstantinos Markopoulos, June Sig Sung, Gunu Jho, Pirros Tsiakoulis, Aimilios Chalamandaris
Published in: INTERSPEECH (2022)
Keyphrases
- training data
- text to speech
- speech synthesis
- emotion recognition
- speech quality
- speech recognition errors
- voice activity detection
- speech recognition
- speech sounds
- speech signal
- learning algorithm
- audio features
- acoustic features
- fundamental frequency
- audio visual
- automatic speech recognition
- training examples
- data sets
- training process
- prosodic features
- classification models
- supervised learning
- decision trees
- test data
- training set
- program synthesis
- music information retrieval
- acoustic models
- synthesized speech
- training instances
- facial animation
- test set
- classification accuracy
- sequence alignment
- texture synthesis
- prior knowledge
- face recognition
- learned from training data
- training dataset
- image alignment
- pairwise
- support vector machine
- dynamic time warping
- spoken language
- multi modal
- feature selection
- training samples
- labeled data