How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?
Ali Araabi, Christof Monz, Vlad Niculae
Published in: AMTA (2022)
Keyphrases
- machine translation
- out of vocabulary
- cross language information retrieval
- language specific
- Chinese English
- parallel corpora
- cross lingual
- English Chinese
- word level
- word sense disambiguation
- word segmentation
- language independent
- language model
- n gram
- spoken document retrieval
- natural language processing
- query translation
- word alignment
- language processing
- broadcast news
- named entity recognition
- machine translation system
- parallel corpus
- target language
- statistical machine translation
- cross language
- information extraction
- query words
- natural language
- bilingual dictionaries
- source language
- word recognition
- search engine
- query terms
- named entities