Spam Filtering Using Inexact String Matching in Explicit Feature Space with On-Line Linear Classifiers.
David Sculley, Gabriel Wachman, Carla E. Brodley
Published in: TREC (2006)
Keyphrases
- spam filtering
- string matching
- linear classifiers
- feature space
- hyperplane
- pattern matching
- edit distance
- multi class
- text classification
- principal components
- kernel function
- data points
- feature vectors
- suffix tree
- input space
- svm classifier
- regular expressions
- spam filters
- generalization error
- training samples
- principal component analysis
- support vector machine
- high dimensional
- training set
- input data
- classification accuracy
- feature extraction
- feature selection
- computer vision
- dimensionality reduction
- probabilistic model
- multi task
- pattern recognition