Naive Bayes spam filtering using word-position-based attributes and length-sensitive classification thresholds.
Johan Hovold
Published in: NODALIDA (2005)
Keyphrases
- spam filtering
- naive bayes
- text classification
- classification accuracy
- decision trees
- naive bayes classifier
- classification algorithm
- uci datasets
- probabilistic classifiers
- text categorization
- naive bayesian classifier
- text classifiers
- logistic regression
- feature selection
- bayesian classifier
- uci data sets
- text mining
- document classification
- bayesian network classifiers
- term frequency
- attribute dependencies
- cost sensitive
- augmented naive bayes
- spam filters
- knn
- averaged one dependence estimators
- training data
- anti spam
- probability estimation
- bayesian networks
- labeled data
- co training
- machine learning
- nominal attributes
- attribute values
- support vector
- naive bayes classification
- feature extraction
- feature space
- association rules
- training set
- k nearest neighbor
- bayesian classifiers
- continuous attributes
- highly skewed
- support vector machine
- semi supervised
- email spam
- nearest neighbor
- databases