Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech.
Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari
Published in: ICASSP (2023)
Keyphrases
- language model
- text-to-speech
- pre-trained
- prosodic features
- speech recognition
- speech synthesis
- language modeling
- n-gram
- probabilistic model
- query expansion
- speaker verification
- automatic speech recognition
- retrieval model
- test collection
- information retrieval
- context sensitive
- ad hoc information retrieval
- mixture model
- training examples
- smoothing methods
- translation model
- control signals
- training data
- noisy environments
- relevance model
- statistical model
- pattern recognition
- multimedia