Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT.
Yining Wang, Long Zhou, Jiajun Zhang, Chengqing Zong
Published in: CoRR (2017)
Keyphrases
- chinese english
- out of vocabulary
- n gram
- language model
- word segmentation
- spoken document retrieval
- cross language information retrieval
- broadcast news
- named entity recognition
- translation model
- statistical machine translation
- parallel corpora
- wordnet
- machine translation
- cross lingual
- hand crafted
- word sense disambiguation
- query terms
- language modeling
- language independent
- linguistic resources
- text classification
- term frequency
- query translation
- named entities
- document retrieval
- natural language processing
- bilingual dictionaries
- speech recognition
- retrieval model
- information extraction
- test collection
- word recognition
- co occurrence