Semantic feature reduction in Chinese document clustering.
Xianjun Meng, Qingcai Chen, Xiaolong Wang
Published in: SMC (2008)
Keyphrases
- document clustering
- feature reduction
- text documents
- text mining
- rough sets
- document collections
- text categorization
- clustering method
- clustering algorithm
- natural language
- k-means
- random forest
- reduction method
- feature set
- text classification
- information retrieval systems
- cluster analysis
- data sets
- support vector machine
- feature selection
- Viterbi algorithm
- pattern recognition
- kNN
- classification accuracy
- named entities
- image processing
- probabilistic model
- hidden Markov models
- pairwise
- search engine