Integrating Unsupervised Data Generation into Self-Supervised Neural Machine Translation for Low-Resource Languages.
Dana Ruiter, Dietrich Klakow, Josef van Genabith, Cristina España-Bonet
Published in: CoRR (2021)
Keyphrases
- machine translation
- data generation
- target language
- language independent
- cross lingual
- statistical machine translation
- multilingual documents
- grammar induction
- language resources
- machine translation system
- source language
- POS tagging
- parallel corpora
- query translation
- information extraction
- language processing
- active learning
- natural language processing
- cross language information retrieval
- data streams
- comparable corpora
- unsupervised learning
- multilingual information retrieval
- streaming data
- semi supervised
- word level
- Chinese-English
- word alignment
- natural language
- high throughput
- word sense disambiguation
- cross language
- bilingual dictionaries
- word order
- supervised learning
- text classification
- finite state transducers
- machine learning
- learning algorithm