DOCmT5: Document-Level Pretraining of Multilingual Language Models.
Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson
Published in: NAACL-HLT (Findings) (2022)
Keyphrases
- document level
- language model
- language modeling
- cross lingual
- n gram
- language independent
- document retrieval
- probabilistic model
- query expansion
- information retrieval
- test collection
- retrieval model
- cross language
- digital libraries
- sentiment classification
- pseudo relevance feedback
- query terms
- cross language information retrieval
- relevance model
- translation model
- vector space model
- retrieval effectiveness
- information retrieval systems
- query translation
- machine learning