Efficient inverted index with n-gram sampling for string matching in Arabic documents.
El Moatez Billah Nagoudi, Ahmed Khorsi, Hadda Cherroun
Published in: AICCSA (2016)
Keyphrases
- n-gram
- string matching
- inverted index
- character n-grams
- language model
- pattern matching
- text classification
- variable length
- language independent
- suffix tree
- data structure
- regular expressions
- language modeling
- data mining
- word segmentation
- edit distance
- query processing
- machine learning
- document retrieval
- feature space
- database