Words versus Character n-Grams for Anti-Spam Filtering.
Ioannis Kanaris, Konstantinos Kanaris, Ioannis Houvardas, Efstathios Stamatatos
Published in: Int. J. Artif. Intell. Tools (2007)
Keyphrases
- character n-grams
- n-gram
- anti-spam filtering
- variable length
- cross-language
- cross-language information retrieval
- language-specific
- optical character recognition
- bag of words
- arabic documents
- language model
- text classification
- language-independent
- mailing lists
- memory-based learning
- anti-spam
- text documents
- language modeling
- query expansion
- digital libraries
- information extraction