Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment.
Orgest Xhelili, Yihong Liu, Hinrich Schütze
Published in: CoRR (2024)
Keyphrases
- language model
- pre-trained
- language modeling
- n-gram
- language independent
- training examples
- document retrieval
- cross language
- cross lingual
- probabilistic model
- cross language information retrieval
- training data
- speech recognition
- query expansion
- retrieval model
- language modelling
- test collection
- query terms
- smoothing methods
- information retrieval
- statistical language models
- relevance model
- training set
- control signals
- supervised learning
- translation model
- active learning
- machine learning
- language models for information retrieval
- text retrieval
- machine translation
- query translation
- appearance variations