Regularizing Text Categorization with Clusters of Words.
Konstantinos Skianis, François Rousseau, Michalis Vazirgiannis
Published in: EMNLP (2016)
Keyphrases
- text categorization
- text documents
- distributional clustering
- training documents
- text classification
- feature selection
- word frequency
- clustering algorithm
- document frequency
- document clustering
- multi label
- reuters corpus
- knn
- n gram
- automated text categorization
- semi supervised learning
- k nearest neighbor
- keywords
- automatic text categorization
- term weighting
- text collections
- information gain
- text classifiers
- naive bayes
- data points
- clustering method
- document categorization
- information theoretic
- text mining
- multi instance multi label learning
- unlabeled data
- tf idf
- nearest neighbor
- pairwise
- decision trees