Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge.
Evgeniy Gabrilovich, Shaul Markovitch
Published in: AAAI (2006)
Keyphrases
- text categorization
- text classification
- feature selection
- multi label
- knn
- information gain
- k nearest neighbor
- text documents
- document categorization
- feature weighting
- automated text categorization
- naive bayes
- feature selection for text categorization
- text classifiers
- document classification
- automatic text categorization
- term weighting
- reuters corpus
- text collections
- unlabeled data
- semi supervised learning
- document frequency
- decision trees
- tf idf
- feature extraction
- distributional clustering
- feature selections
- data sets