mSLAM: Massively multilingual joint pre-training for speech and text.
Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau
Published in: CoRR (2022)
Keyphrases
- multi lingual
- text generation
- text to speech
- english text
- text to speech synthesis
- hearing impaired
- spontaneous speech
- natural language generation
- training process
- text input
- language independent
- training set
- speech recognition
- text retrieval
- cross lingual
- information retrieval
- lexical features
- text recognition
- text mining
- massively parallel
- language model
- speech signal
- audio visual
- database
- information access
- question answering
- web documents
- supervised learning
- digital libraries
- keywords
- text documents