On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition.
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter
Published in: ASRU (2023)
Keyphrases
- automatic speech recognition
- training data
- speech recognition
- acoustic models
- speech signal
- word error rate
- hidden Markov models
- spoken words
- broadcast news
- spontaneous speech
- speech synthesis
- recognition errors
- conversational speech
- noisy environments
- information retrieval
- speech corpus
- acoustic features
- relevance feedback
- word recognition
- test collection
- training process
- speech retrieval
- speech recognizer
- language model
- speaker dependent
- phoneme recognition
- speech sounds
- computer vision