Non-Standard Words as Features for Text Categorization.
Slobodan Beliga, Sanda Martincic-Ipsic
Published in: CoRR (2014)
Keyphrases
- text categorization
- feature generation
- feature weighting
- text documents
- training documents
- information gain
- feature selection
- text classification
- feature reduction
- knn
- multi label
- distributional clustering
- text classifiers
- word frequency
- document frequency
- k nearest neighbor
- feature extraction
- linear svm
- naive bayes
- tf idf
- reuters corpus
- semi supervised learning
- automatic text categorization
- word sense disambiguation
- text collections
- co occurrence
- automated text categorization
- feature vectors
- information retrieval
- feature selection for text categorization
- term frequency
- n gram
- feature set
- information extraction
- learning algorithm