Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT.
Yining Wang, Long Zhou, Jiajun Zhang, Chengqing Zong
Published in: CWMT (2017)
Keyphrases
- Chinese-English
- out-of-vocabulary
- language model
- n-gram
- word segmentation
- spoken document retrieval
- cross-language information retrieval
- broadcast news
- named entity recognition
- translation model
- statistical machine translation
- parallel corpora
- machine translation
- WordNet
- hand-crafted
- query terms
- named entities
- word sense disambiguation
- cross-lingual
- language modeling
- cross-language
- term frequency
- language-independent
- query translation
- word recognition
- probabilistic model
- co-occurrence
- information extraction
- search engine
- test collection
- knowledge base
- linguistic resources
- bilingual dictionaries
- semi-supervised
- text classification
- question answering
- word pairs