Combining n-Grams and Stemming for Arabic Word-Based Inexact Matching and Term Conflation.
Suleiman H. Mustafa
Published in: J. Inf. Knowl. Manag. (2005)
Keyphrases
- n-gram
- inexact matching
- structural descriptions
- character n-grams
- language model
- variable length
- document frequency
- text classification
- language independent
- bag of words
- word segmentation
- representation scheme
- language modeling
- out of vocabulary
- language specific
- part of speech
- inside outside algorithm
- word level
- handwritten words
- web documents