Training SVM email classifiers using very large imbalanced dataset.
Lili Diao, Chengzhong Yang, Hao Wang
Published in: J. Exp. Theor. Artif. Intell. (2012)
Keyphrases
- imbalanced datasets
- training set
- support vector
- class distribution
- linear svm
- training data
- training dataset
- svm classifier
- training examples
- test set
- rare class
- training samples
- imbalanced data
- decision trees
- training process
- support vector machine
- classification algorithm
- cost sensitive learning
- feature selection algorithms
- feature selection
- sampling methods
- decision boundary
- support vector machine svm
- support vectors
- minority class
- class imbalance
- hyperplane
- ensemble classifier
- data sets
- nearest neighbor
- supervised learning
- cross validation
- naive bayes
- multi class
- text mining
- unlabeled data
- text categorization
- classification accuracy
- prediction accuracy
- active learning
- feature vectors
- binary classification
- feature space
- ensemble methods
- loss function
- binary classifiers
- knn
- feature set
- ensemble learning
- classification models
- machine learning