Multi-Class Text Classification Based in Oversampling for Highly Imbalanced Dataset.
Dário P. Dos Santos, João Paulo J. Da Costa, Daniel Alves da Silva, Fábio L. L. Mendonça, Carlos Eduardo Lacerda Veiga, Rafael T. de Sousa
Published in: ICMLA (2023)
Keyphrases
- multi class
- text classification
- imbalanced datasets
- class imbalance
- feature selection
- cost sensitive
- class distribution
- cost sensitive learning
- binary classification
- support vector machine
- multi class classification
- minority class
- text categorization
- machine learning
- naive bayes
- text mining
- base classifiers
- sampling methods
- labeled data
- k nearest neighbor
- multi label
- misclassification costs
- pairwise
- feature selection algorithms
- imbalanced data
- ensemble methods
- dimensionality reduction
- knn
- decision trees
- high dimensionality
- linear classifiers
- binary classifiers
- unlabeled data
- concept drift
- classification accuracy
- training dataset
- original data
- feature space
- feature set