Clustering Documents with Large Overlap of Terms into Different Clusters based on Similarity Rough Set Model.
Nguyen Chi Thanh, Koichi Yamada, Muneyuki Unehara
Published in: KDIR (2010)
Keyphrases
- document clustering
- clustering algorithm
- overlapping clusters
- document clusters
- cluster labels
- cosine similarity
- intra-cluster
- hierarchical clustering
- clustering quality
- document collections
- cluster analysis
- similar objects
- similarity scores
- similarity matrix
- inter-cluster
- related documents
- rough set model
- data points
- document corpus
- information retrieval
- clustering method
- similarity function
- k-means
- semantic similarity
- similarity measure
- rule induction
- text documents
- decision rules
- distance function
- decision making
- spectral clustering
- background knowledge
- distance measure
- pattern recognition
- real world