A Fast Chinese Web-document Clustering Method under Pareto's Principle.
Tianlei Zhang, Guisheng Chen, Hao Che
Published in: GrC (2008)
Keyphrases
- clustering method
- web documents
- multi-objective
- web pages
- clustering algorithm
- affinity propagation
- information extraction
- hierarchical clustering
- fuzzy c-means
- relational clustering
- keywords
- cluster analysis
- vector space model
- prefetching
- clustering framework
- subspace clustering
- document clustering
- spectral clustering
- web content
- document representation
- textual information
- clustering analysis
- k-means
- unstructured documents
- web logs
- constrained clustering
- unsupervised clustering
- hierarchical agglomerative clustering
- clustering approaches
- dissimilarity measure
- knowledge discovery
- similarity measure
- classical clustering algorithms