Classification of Skewed and Homogenous Document Corpora with Class-Based and Corpus-Based Keywords.
Arzucan Özgür, Tunga Güngör
Published in: KI (2006)
Keyphrases
- keywords
- document classification
- class labels
- text documents
- document corpus
- keyword extraction
- web documents
- multiclass classification
- multi class classification
- classification accuracy
- document collections
- keyword search
- machine learning
- text classification
- feature set
- decision trees
- support vector
- search engine
- information retrieval systems
- class specific
- feature extraction
- support vector machine
- user queries
- natural language processing
- cost sensitive
- training set
- relevant documents
- multi class
- similar documents
- feature selection