Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-Dependent Breath Detector.
Éva Székely, Gustav Eje Henter, Joakim Gustafson
Published in: ICASSP (2019)
Keyphrases
- speaker dependent
- conversational speech
- speech recognition
- phoneme recognition
- speaker identification
- text to speech
- image segmentation
- speech synthesis
- hidden Markov models
- recurrent neural networks
- dialogue system
- information retrieval
- natural language
- extracting features
- text data
- Gaussian mixture model
- mixture model
- neural network