Feature selection for text categorization on imbalanced data.
Zhaohui Zheng, Xiaoyun Wu, Rohini K. Srihari
Published in: SIGKDD Explor. (2004)
Keyphrases
- imbalanced data
- feature selection for text categorization
- text categorization
- feature selection
- information gain
- class distribution
- linear regression
- ensemble methods
- sampling methods
- random forest
- class imbalance
- classification models
- decision trees
- support vector machine
- ensemble learning
- data mining
- ensemble classifier
- minority class
- high dimensionality
- text classification
- decision boundary
- classification accuracy
- machine learning