Classification of Imbalanced Documents by Feature Selection.
Yusuke Adachi, Naoya Onimura, Takanori Yamashita, Sachio Hirokawa
Published in: ICCDA (2017)
Keyphrases
- feature selection
- document classification
- classification accuracy
- text classification
- class imbalance
- classification models
- support vector
- feature set
- feature extraction
- feature space
- feature selection algorithms
- information retrieval systems
- machine learning
- unsupervised learning
- high dimensionality
- support vector machine
- text categorization
- image classification
- method for feature selection
- irrelevant features
- binary classification problems
- discriminative features
- text classifiers
- pre-classified
- imbalanced datasets
- imbalanced data
- decision trees
- multi-task
- text documents
- classification method
- classification algorithm
- document collections
- feature subset
- automatic categorization
- document clustering
- automatic classification
- classification performances
- cost-sensitive
- web documents
- feature ranking
- support vector machine (SVM)
- feature selection and classification
- redundant features
- relevant documents
- knn
- metadata
- class distribution
- training data
- information gain
- xml documents