VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition/Synthesis and Speech/Text Continuation Tasks.
Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-Weon Jung, Xuankai Chang, Shinji Watanabe
Published in: ICASSP (2024)
Keyphrases
- speech recognition
- acoustic models
- speech signal
- hidden markov models
- speech synthesis
- speech processing
- speech recognizer
- automatic speech recognition
- language model
- speaker identification
- speech recognition systems
- noisy environments
- speech recognition technology
- speaker independent
- speech recognition errors
- pattern recognition
- probabilistic model
- recognition engine
- isolated word
- speech recognizers
- speaker dependent
- neural network
- speaker recognition
- information retrieval
- keyword spotting
- word error rate
- text to speech
- handwriting recognition
- cepstral coefficients
- vocal tract