GP Classification under Imbalanced Data sets: Active Sub-sampling and AUC Approximation.
John A. Doucette, Malcolm I. Heywood
Published in: EuroGP (2008)
Keyphrases
- imbalanced data sets
- imbalanced data
- class distribution
- minority class
- ROC curve
- imbalanced class distribution
- benchmark data sets
- class imbalance
- genetic programming
- classification accuracy
- rare events
- concept learning
- decision trees
- pattern classification
- data generation
- data sets
- feature extraction
- sampling methods
- receiver operating characteristic
- machine learning
- Monte Carlo
- cost sensitive
- text classification
- training data
- support vector
- importance sampling
- misclassification costs
- learning algorithm
- feature selection
- machine learning methods
- training set
- supervised learning
- training samples
- classification models
- sample size
- high dimensionality