Accurate SVM Text Classification for Highly Skewed Data Using Threshold Tuning and Query-Expansion-Based Feature Selection.
Ben Goertzel, James Venuto
Published in: IJCNN (2006)
Keyphrases
- feature selection
- text classification
- query expansion
- data sets
- highly skewed
- data points
- text categorization
- machine learning
- original data
- high dimensional data
- support vector
- training data
- language model
- information retrieval
- support vector machine (SVM)
- information retrieval systems
- labeled data
- feature set
- naive bayes
- natural language processing
- test data
- relevant documents
- document retrieval
- query terms
- pseudo relevance feedback
- text classifiers
- feature space