Character N-Gram Tokenization for European Language Text Retrieval.
Paul McNamee, James Mayfield
Published in: Inf. Retr. (2004)
Keyphrases
- character n-grams
- text retrieval
- cross-language
- language-neutral
- n-gram
- language-independent
- cross-language information retrieval
- variable length
- information retrieval
- document collections
- image retrieval
- language-specific
- document retrieval
- query expansion
- retrieval systems
- cross-lingual
- automatic query expansion
- language model
- parallel corpora
- retrieval model
- question answering