Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks.
Soumi MaitiYifan PengShukjae ChoiJee-weon JungXuankai ChangShinji WatanabePublished in: CoRR (2023)
Keyphrases
- speech recognition
- acoustic models
- speech synthesis
- automatic speech recognition
- hidden markov models
- language model
- speech signal
- speech processing
- speech recognizer
- speaker independent
- probabilistic model
- speech recognition systems
- speaker identification
- speech recognizers
- noisy environments
- word error rate
- keyword spotting
- pattern recognition
- speech recognition technology
- neural network
- handwriting recognition
- speech retrieval
- text to speech
- vocal tract
- recognition engine
- speaker adaptation
- speaker recognition
- speaker dependent
- speech recognition errors
- noisy speech