JHU/APL Experiments in Tokenization and Non-Word Translation.
Paul McNamee, James Mayfield
Published in: CLEF (Working Notes) (2003)
Keyphrases
- character n grams
- n gram
- cross language information retrieval
- translation model
- cross language
- statistical machine translation
- variable length
- machine translation system
- named entities
- query translation
- co occurrence
- machine translation
- out of vocabulary
- language model
- parallel corpus
- bilingual dictionaries
- word alignment
- english words
- word level
- parallel corpora
- query words
- word recognition
- word segmentation
- chinese english
- english chinese
- noun phrases
- word sense disambiguation
- text classification
- named entity recognizer