Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages.
Ayyoob Imani, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André F. T. Martins, François Yvon, Hinrich Schütze
Published in: ACL (1) (2023)
Keyphrases
- language model
- language modeling
- comparable corpora
- cross lingual
- statistical machine translation
- language independent
- parallel corpus
- n gram
- parallel corpora
- translation model
- chinese english
- linguistic resources
- cross language information retrieval
- language modelling
- document retrieval
- cross lingual information retrieval
- retrieval model
- cross language
- probabilistic model
- machine translation system
- query expansion
- information retrieval
- speech recognition
- statistical language models
- query terms
- test collection
- text retrieval
- pseudo relevance feedback
- document level
- language model for information retrieval
- relevance model
- query translation
- bilingual dictionaries
- word segmentation
- vector space model
- context sensitive
- machine translation
- language models for information retrieval
- smoothing methods
- statistical language modeling
- natural language processing