JHU/APL Experiments in Tokenization and Non-word Translation.
Paul McNamee, James Mayfield
Published in: CLEF (2003)
Keyphrases
- character n grams
- n gram
- cross language information retrieval
- statistical machine translation
- translation model
- variable length
- cross language
- machine translation system
- machine translation
- english words
- target language
- parallel corpus
- query translation
- bilingual dictionaries
- word alignment
- named entities
- co occurrence
- out of vocabulary
- word level
- word recognition
- english chinese
- named entity recognizer
- source language
- training corpus
- word pairs
- text retrieval
- information extraction
- language modeling
- character recognition
- word sense disambiguation
- language specific
- query words
- language model
- feature selection
- information retrieval