String Vectors as a Representation of Documents with Numerical Vectors in Text Categorization.
Taeho Jo, Malrey Lee, Yigon Kim
Published in: J. Convergence Inf. Technol. (2007)
Keyphrases
- text categorization
- text documents
- document classification
- automatic text categorization
- automatic categorization
- document categorization
- text classification
- vector space
- training documents
- text collections
- text classifiers
- feature selection
- text representation
- term frequency
- multi label
- information gain
- term selection
- distributional clustering
- knn
- document collections
- text data
- semi supervised learning
- k nearest neighbor
- reuters corpus
- classify documents
- naive bayes
- tf idf
- unlabeled data
- term weighting
- information retrieval systems
- learning algorithm
- machine learning
- document retrieval
- active learning
- nearest neighbor
- document frequency
- language model
- vector space model