MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation.
Yifan Peng, Ilia Kulikov, Yilin Yang, Sravya Popuri, Hui Lu, Changhan Wang, Hongyu Gong
Published in: CoRR (2024)
Keyphrases
- speech recognition
- language model
- automatic speech recognition
- word error rate
- speech signal
- audio visual
- language modeling
- multi-task
- document retrieval
- probabilistic model
- unsupervised learning
- statistical machine translation
- retrieval model
- machine learning
- transfer learning
- out-of-vocabulary
- smoothing methods
- cross language retrieval
- hidden Markov models