Verifying a Chinese collection for text categorization.
Yuen-Hsien Tseng, William John Teahan
Published in: SIGIR (2004)
Keyphrases
- text categorization
- text collections
- automatic categorization
- text classification
- multi label
- feature selection
- semi supervised learning
- reuters corpus
- text documents
- automated text categorization
- information gain
- naive bayes
- text classifiers
- knn
- document collections
- feature weighting
- term frequency
- text summarization
- document categorization
- neural network
- multi instance multi label learning
- tf idf
- term weighting
- unlabeled data
- k nearest neighbor
- document frequency
- automatic text categorization
- semi supervised
- prior knowledge
- feature selection for text categorization