Rule-based word clustering for text classification.
Hui Han, Eren Manavoglu, C. Lee Giles, Hongyuan Zha
Published in: SIGIR (2003)
Keyphrases
- text classification
- distributional clustering
- n-gram
- unsupervised learning
- clustering algorithm
- training corpus
- text categorization
- clustering method
- bag of words
- k-means
- information theoretic
- feature selection
- word segmentation
- term frequency
- self-organizing maps
- topic discovery
- cluster analysis
- co-occurrence
- text mining
- data cleaning
- text data
- data clustering
- labeled data
- outlier detection
- naive bayes
- sentiment analysis
- document clustering
- high dimensional data
- word sense disambiguation
- text documents
- data points
- text classifiers
- multi-label
- keywords
- probabilistic model