OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification.
Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
Published in: ACL (1) (2024)
Keyphrases
- speech recognition
- speaker identification
- speech signal
- language identification
- pattern recognition
- hidden Markov models
- speech processing
- speech synthesis
- probabilistic model
- language model
- multimodal
- noisy environments
- noisy speech
- neural network
- speech recognition systems
- speaker recognition
- automatic speech recognition
- Gaussian mixture model