Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models.
Terra Blevins, Luke Zettlemoyer
Published in: EMNLP (2022)
Keyphrases
- cross lingual
- parallel corpus
- machine translation
- language specific
- european languages
- translation model
- language modeling
- cross lingual information retrieval
- cross language
- event extraction
- language independent
- indian languages
- machine translation system
- natural language
- source language
- word alignment
- linguistic resources
- target language
- query translation
- text classification
- probabilistic model
- cross language information retrieval
- comparable corpora
- monolingual retrieval
- mono lingual
- statistical machine translation
- bilingual dictionaries
- news articles
- character n grams
- artificial intelligence