Effect of small sample size on text categorization with support vector machines.
Pawel Matykiewicz, John Pestian
Published in: BioNLP@HLT-NAACL (2012)
Keyphrases
- text categorization
- small sample size
- feature selection
- sample size
- high dimensionality
- multi label
- text documents
- text classification
- face recognition
- microarray data
- knn
- linear discriminant analysis
- high dimensional
- automated text categorization
- information gain
- k nearest neighbor
- text classifiers
- tf idf
- high dimensional data
- feature selection for text categorization
- reuters corpus
- support vector machine
- semi supervised learning
- nearest neighbor
- feature selections
- feature space
- object recognition
- unlabeled data
- support vector
- maximum likelihood
- worst case