When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages.
Tyler A. Chang, Catherine Arnett, Zhuowen Tu, Benjamin K. Bergen
Published in: CoRR (2023)
Keyphrases
- language modeling
- cross-lingual
- language model
- comparable corpora
- language-independent
- retrieval model
- information retrieval
- n-gram
- query expansion
- probabilistic model
- high-dimensional
- text classification
- high-dimensional data
- document retrieval
- statistical machine translation
- cross-language
- dimensionality reduction
- pseudo feedback
- improvements in retrieval effectiveness
- statistical language models
- query translation
- relevance model
- word segmentation
- grammatical inference
- parallel corpora
- linguistic resources
- data mining
- sentence retrieval
- finite-state transducers
- information extraction
- test collection
- language modeling framework
- mixture model