Reversing Morphological Tokenization in English-to-Arabic SMT.
Mohammad Salameh, Colin Cherry, Grzegorz Kondrak
Published in: HLT-NAACL (2013)
Keyphrases
- word forms
- statistical machine translation
- Arabic language
- character n-grams
- language independent
- language identification
- machine translation
- morphological analysis
- MT evaluation
- phrase-based SMT
- n-gram
- machine translation system
- cross-language information retrieval
- syntactic categories
- English language
- Arabic documents
- unknown words
- named entities
- image processing
- multiscale
- mathematical morphology
- cross-language
- multiword
- structuring elements
- biomedical text
- word alignment
- grammar induction
- biomedical information retrieval
- cross-lingual
- source language
- query translation
- optical character recognition
- language model
- natural language
- text retrieval
- answer questions
- training corpus
- language learning