Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi.
Md. Nishat Raihan, Dhiman Goswami, Antara Mahmud
Published in: CoRR (2023)
Keyphrases
- language modeling
- cross-lingual
- language model
- statistical machine translation
- Indian languages
- comparable corpora
- retrieval model
- information retrieval
- machine translation
- n-gram
- probabilistic model
- query expansion
- machine learning
- cross-language
- language-independent
- query translation
- language identification
- query processing
- word segmentation
- natural language
- search engine
- statistical language models