Hybrid Clustering Approach for Term Partitioning in Document Data Sets.
K. Thammi Reddy, M. Shashi, L. Pratap Reddy
Published in: J. Digit. Inf. Manag. (2008)
Keyphrases
- data sets
- document clustering
- document representation
- cosine similarity
- k-means
- clustering method
- text clustering
- clustering algorithm
- clustering scheme
- tolerance rough set
- information retrieval systems
- cluster membership
- document images
- term frequency
- keywords
- text categorization
- document collections
- mixed data
- partitioning algorithm
- high dimensional data sets
- graph partitioning
- categorical data
- spectral clustering
- data clustering
- data points
- text documents
- hierarchical clustering
- training data
- document space
- bag of words
- information retrieval
- web documents
- retrieval systems
- simultaneous clustering
- term dependence
- validity indices
- clustering analysis
- tf-idf
- database
- cluster structure
- document identifiers
- inverted lists
- relevant documents
- index terms
- similarity measure
- vector space model
- fuzzy clustering