Document clustering with cluster refinement and model selection capabilities.
Xin Liu, Yihong Gong, Wei Xu, Shenghuo Zhu
Published in: SIGIR (2002)
Keyphrases
- model selection
- document clustering
- clustering algorithm
- document clusters
- clustering quality
- cross validation
- clustering approaches
- cluster analysis
- text mining
- mixture model
- non-negative matrix factorization
- clustering method
- generalization error
- k-means
- document corpus
- feature selection
- data clustering
- document collections
- selection criterion
- model selection criteria
- bisecting k-means
- unsupervised learning
- meta learning
- text documents
- information criterion
- machine learning
- Bayesian information criterion
- pairwise constraints
- constrained clustering
- information extraction
- data analysis
- data sets