Karaoker: Alignment-free singing voice synthesis with speech training data.
Panos Kakoulidis, Nikolaos Ellinas, Georgios Vamvoukakis, Konstantinos Markopoulos, June Sig Sung, Gunu Jho, Pirros Tsiakoulis, Aimilios Chalamandaris
Published in: CoRR (2022)
Keyphrases
- training data
- text to speech
- emotion recognition
- speech synthesis
- voice activity detection
- speech recognition errors
- speech recognition
- speech quality
- data sets
- decision trees
- audio features
- facial animation
- speech sounds
- fundamental frequency
- learning algorithm
- prosodic features
- supervised learning
- audio visual
- speech signal
- synthesized speech
- test data
- image alignment
- training examples
- training set
- classification accuracy
- training process
- facial expressions
- labeled data
- automatic speech recognition
- training dataset
- text to speech synthesis
- training samples
- acoustic features
- acoustic models
- recognition engine
- dialogue system
- learned from training data
- program synthesis
- spectral features
- speaker verification
- broadcast news
- training instances
- sequence alignment
- prior knowledge
- machine learning