The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation.
Tobias Domhan, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne, Felix Hieber
Published in: CoRR (2022)
Keyphrases
- machine translation
- information extraction
- cross lingual
- language independent
- language processing
- natural language
- natural language processing
- cross language information retrieval
- language resources
- target language
- statistical machine translation
- machine translation system
- word alignment
- parallel corpora
- source language
- Chinese English
- word sense disambiguation
- Brazilian Portuguese
- keywords
- natural language generation
- parallel corpus