Comparison of term frequency and document frequency based feature selection metrics in text categorization.
Nouman Azam, Jingtao Yao
Published in: Expert Syst. Appl. (2012)
Keyphrases
- text categorization
- document frequency
- term frequency
- feature selection
- information gain
- text classification
- tf idf
- text documents
- k nearest neighbor
- knn
- term weighting
- naive bayes
- semi supervised learning
- bag of words
- feature subset
- data mining
- retrieval model
- mutual information
- unsupervised learning
- evaluation metrics
- image classification
- training set
- retrieved documents
- dimensionality reduction