Beyond Shared Vocabulary: Increasing Representational Word Similarities across Languages for Multilingual Machine Translation.
Di Wu, Christof Monz
Published in: EMNLP (2023)
Keyphrases
- machine translation
- language specific
- cross lingual
- language independent
- out of vocabulary
- target language
- machine translation system
- multilingual documents
- statistical machine translation
- cross language information retrieval
- bilingual dictionaries
- word sense disambiguation
- language resources
- parallel corpus
- word level
- parallel corpora
- language processing
- natural language processing
- translation model
- word alignment
- information extraction
- chinese english
- natural language
- query translation
- cross lingual information retrieval
- multilingual information retrieval
- comparable corpora
- n gram
- indian languages
- natural language generation
- source language
- english chinese
- tasks in natural language processing
- word segmentation
- cross language
- bilingual lexicon
- keywords
- word pairs
- text classification
- linguistic knowledge
- sentiment classification
- language model
- machine learning
- statistical translation models