Experiments in high-dimensional text categorization.
Fred Damerau, Tong Zhang, Sholom M. Weiss, Nitin Indurkhya
Published in: SIGIR (2002)
Keyphrases
- text categorization
- high dimensional
- text classification
- knn
- feature selection
- low dimensional
- k nearest neighbor
- multi label
- nearest neighbor
- information gain
- dimensionality reduction
- data points
- text documents
- naive bayes
- high dimensional data
- similarity search
- reuters corpus
- text classifiers
- feature space
- feature weighting
- automatic text categorization
- semantic browsing
- automated text categorization
- text collections
- tf idf
- bag of words
- term frequency
- document categorization
- document frequency
- training documents
- support vector
- term weighting
- unlabeled data
- semi supervised learning
- image classification