Non-standard words as features for text categorization.
Slobodan Beliga, Sanda Martincic-Ipsic
Published in: MIPRO (2014)
Keyphrases
- text categorization
- training documents
- text documents
- feature generation
- feature weighting
- information gain
- text classification
- knn
- distributional clustering
- feature selection
- feature reduction
- k nearest neighbor
- linear svm
- multi label
- semi supervised learning
- naive bayes
- word sense disambiguation
- feature set
- automated text categorization
- feature vectors
- feature extraction
- word frequency
- automatic text categorization
- feature selections
- document frequency
- information theoretic
- n gram
- co occurrence
- term weighting
- text collections
- reuters corpus
- support vector machine
- training data