Comparison of metrics for feature selection in imbalanced text classification.
Hiroshi Ogura, Hiromi Amano, Masato Kondo
Published in: Expert Syst. Appl. (2011)
Keyphrases
- text classification
- feature selection
- text categorization
- naive bayes
- bag of words
- text mining
- text documents
- machine learning
- feature weighting
- information gain
- mutual information
- feature reduction
- multi class
- text classifiers
- data cleaning
- feature space
- labeled data
- knn
- n gram
- support vector
- multi label
- high dimensionality
- feature engineering
- classification accuracy
- chi squared
- supervised feature selection
- support vector machine
- semantic features
- class imbalance
- feature selection algorithms
- binary classification problems
- feature subset
- neural network
- model selection