Class imbalance in out-of-distribution datasets: Improving the robustness of the TextCNN for the classification of rare cancer types.
Kevin De Angeli, Shang Gao, Ioana Danciu, Eric B. Durbin, Xiao-Cheng Wu, Antoinette Stroup, Jennifer A. Doherty, Stephen M. Schwartz, Charles Wiggins, Mark Damesyn, Linda Coyle, Lynne Penberthy, Georgia D. Tourassi, Hong-Jun Yoon
Published in: J. Biomed. Informatics (2022)
Keyphrases
- class imbalance
- imbalanced datasets
- binary classification problems
- sampling methods
- class distribution
- active learning
- cost sensitive learning
- cost sensitive
- rare events
- class noise
- imbalanced data
- majority class
- class imbalanced
- high dimensionality
- imbalanced class distribution
- feature selection
- imbalanced data sets
- benchmark datasets
- concept drift
- classification accuracy
- data sets
- minority class
- training set
- pattern recognition
- rare class
- decision trees
- training dataset
- misclassification costs
- classification models
- pattern classification
- class labels
- training samples
- image classification
- multi class
- learning algorithm