Comparison of feature selection methods in text classification on highly skewed datasets.
Muhammad Nabeel Asim, Muhammad Wasim, Muhammad Sajid Ali, Abdur Rehman
Published in: INTELLECT (2017)
Keyphrases
- text classification
- highly skewed
- naive bayes
- imbalanced datasets
- text categorization
- imbalanced data
- class distribution
- text mining
- machine learning
- unlabeled data
- training dataset
- benchmark datasets
- data cleaning
- feature selection
- multi-label
- feature selection algorithms
- class imbalance
- misclassification costs
- sampling methods