A Hybrid Geometric Approach for Measuring Similarity Level Among Documents and Document Clustering.
Arash Heidarian, Michael J. Dinneen
Published in: BigDataService (2016)
Keyphrases
- document clustering
- measuring similarity
- document collections
- document representation
- document similarity
- text documents
- similarity measure
- vector space model
- document clusters
- text mining
- clustering method
- similar documents
- topic extraction
- clustering algorithm
- automatic categorization
- k-means
- cluster analysis
- language model
- tolerance rough set
- information retrieval systems
- principal component analysis
- data analysis
- metadata