How Much Does Tokenization Affect Neural Machine Translation?
Miguel Domingo, Mercedes García-Martínez, Alexandre Helle, Francisco Casacuberta, Manuel Herranz
Published in: CoRR (2018)
Keyphrases
- machine translation
- cross lingual
- natural language processing
- information extraction
- named entities
- language independent
- Chinese-English
- natural language
- language resources
- word sense disambiguation
- cross language information retrieval
- target language
- statistical machine translation
- natural language generation
- language processing
- word alignment
- associative memory
- Brazilian Portuguese
- machine translation system
- word level
- parallel corpora
- parallel corpus
- query translation
- statistical translation models