Weighted Document Frequency for feature selection in text classification.
Baoli Li, Qiuling Yan, Zhenqiang Xu, Guicai Wang
Published in: IALP (2015)
Keyphrases
- text classification
- document frequency
- feature selection
- text categorization
- term frequency
- information gain
- n-gram
- bag of words
- naive bayes
- labeled data
- machine learning
- mutual information
- text documents
- text mining
- term weighting
- knn
- support vector
- k nearest neighbor
- text data
- feature extraction
- unlabeled data
- retrieved documents
- language modeling
- neural network
- feature set
- document representation
- feature space
- classification accuracy
- information extraction
- unsupervised learning
- dimensionality reduction