Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction.
Shubhanshu Mishra, Aria Haghighi
Published in: W-NUT (2021)
Keyphrases
- language model
- translation model
- machine translation system
- social media
- language modeling
- statistical machine translation
- information retrieval
- cross lingual
- cross language retrieval
- document retrieval
- cross language
- cross language information retrieval
- n-gram
- comparable corpora
- Chinese-English
- text retrieval
- retrieval model
- speech recognition
- probabilistic model
- multiword
- mixture model
- query expansion
- language modelling
- out-of-vocabulary
- document level
- test collection
- smoothing methods
- query translation
- query terms
- ad hoc information retrieval
- language independent
- bilingual dictionaries
- vector space model
- context sensitive
- machine translation
- cross lingual information retrieval
- word level
- parallel corpora
- text mining
- social networks
- digital libraries
- relevance model
- pseudo relevance feedback
- information access
- retrieval effectiveness
- retrieval systems
- document collections
- text classification
- co-occurrence
- keywords