Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining.
Nikola Ljubesic, Vít Suchomel, Peter Rupnik, Taja Kuzman, Rik van Noord
Published in: CoRR (2024)
Keyphrases
- closely related
- language model
- cost efficient
- language modeling
- probabilistic model
- n gram
- speech recognition
- document retrieval
- statistical language models
- language modelling
- query expansion
- retrieval model
- cross lingual
- information retrieval
- context sensitive
- special case
- language independent
- query specific
- ad hoc information retrieval
- spoken term detection
- language models for information retrieval
- smoothing methods
- term dependencies
- text summarization
- vector space model
- np hard
- lower bound
- objective function