A new feature selection based on comprehensive measurement both in inter-category and intra-category for text categorization.
Jieming Yang, Yuanning Liu, Xiaodong Zhu, Zhen Liu, Xiaoxu Zhang
Published in: Inf. Process. Manag. (2012)
Keyphrases
- text categorization
- feature selection
- training documents
- text classification
- classify documents
- information gain
- multi label
- automated text categorization
- knn
- naive bayes
- k nearest neighbor
- text filtering
- feature generation
- feature weighting
- text documents
- automatic text categorization
- feature selections
- text classifiers
- feature subset
- tf idf
- semi supervised learning
- data sets
- unlabeled data
- classification accuracy
- feature space
- support vector
- reuters corpus
- term frequency
- term weighting
- document frequency
- reinforcement learning
- feature selection and classifier
- decision trees
- machine learning