Spam Detection Using Character N-Grams.
Ioannis Kanaris, Konstantinos Kanaris, Efstathios Stamatatos
Published in: SETN (2006)
Keyphrases
- spam detection
- character n-grams
- n-gram
- variable length
- cross language
- cross language information retrieval
- web spam
- optical character recognition
- spam filtering
- fraud detection
- text classification
- language model
- bag of words
- language specific
- document retrieval
- text retrieval
- language modeling
- document collections
- language independent
- bayesian networks
- cross lingual
- knowledge representation