Feature Selection Based on Term Frequency and T-Test for Text Categorization
Deqing Wang, Hui Zhang, Rui Liu, Weifeng Lv
Published in: CoRR (2013)
Keyphrases
- decision trees
- text categorization
- term frequency
- feature selection
- document frequency
- text classification
- information gain
- automatic text categorization
- naive bayes
- machine learning
- term weighting
- tf idf
- k nearest neighbor
- knn
- text documents
- text classifiers
- semi supervised learning
- training data
- training set
- support vector
- feature extraction
- nearest neighbor
- labeled data
- feature space
- support vector machine