Needle in a Haystack: Reducing the Costs of Annotating Rare-Class Instances in Imbalanced Datasets.
Emily Jamison, Iryna Gurevych
Published in: PACLIC (2014)
Keyphrases
- rare class
- imbalanced datasets
- cost sensitive learning
- cost sensitive
- misclassification costs
- class distribution
- class imbalance
- imbalanced class distribution
- decision trees
- ensemble methods
- missing values
- active learning
- minority class
- total cost
- rule extraction
- probability estimation
- sampling methods
- imbalanced data
- majority class
- training dataset
- feature selection algorithms
- supervised learning
- multi class
- data streams