The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation.
Tobias Domhan, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne, Felix Hieber
Published in: NAACL-HLT (2022)
Keyphrases
- machine translation
- language processing
- natural language processing
- cross lingual
- cross language information retrieval
- information extraction
- language independent
- chinese english
- word sense disambiguation
- target language
- natural language
- natural language generation
- language resources
- machine translation system
- word alignment
- statistical machine translation
- word level
- parallel corpora
- parallel corpus
- keywords
- brazilian portuguese
- out of vocabulary
- source language
- finite state transducers
- statistical translation models