High-performing feature selection for text classification.
Monica Rogati, Yiming Yang
Published in: CIKM (2002)
Keyphrases
- text classification
- feature selection
- text categorization
- bag of words
- web page classification
- machine learning
- text mining
- text classifiers
- n gram
- naive bayes
- unlabeled data
- text documents
- wide range
- knn
- feature engineering
- labeled data
- classification accuracy
- sentiment analysis
- mutual information
- support vector
- text classification tasks
- training data
- multi label
- unsupervised learning
- model selection
- multi class
- feature space
- high dimensionality
- association rules
- feature subset
- feature selection algorithms
- semantic features
- support vector machine
- discriminative features