DOCmT5: Document-Level Pretraining of Multilingual Language Models.
Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson
Published in: CoRR (2021)
Keyphrases
- document level
- language model
- language modeling
- cross lingual
- document retrieval
- n gram
- cross language
- probabilistic model
- retrieval model
- language independent
- query expansion
- information retrieval
- test collection
- cross language information retrieval
- query terms
- sentiment classification
- digital libraries
- relevance model
- translation model
- out of vocabulary
- Chinese English
- pseudo relevance feedback
- text retrieval
- vector space model
- question answering
- co occurrence
- search engine