Using linear regression residual of document vectors in text categorization.
Hakan Altinçay
Published in: SIU (2013)
Keyphrases
- text categorization
- linear regression
- document classification
- text documents
- term frequency
- text classifiers
- least squares
- training documents
- tf idf
- automatic categorization
- text collections
- text classification
- automatic text categorization
- document categorization
- document frequency
- knn
- multi label
- feature selection
- k nearest neighbor
- term weighting
- naive bayes
- information gain
- classify documents
- semi supervised learning
- machine learning
- information retrieval
- document representation
- document collections
- document clustering
- document retrieval
- feature vectors
- information retrieval systems
- vector space
- keywords
- decision trees
- dimensionality reduction
- relevant documents
- web documents