An alternative, layout-driven approach to the clustering of documents.
Vincenzo Loia, Sabrina Senatore
Published in: Int. J. Intell. Syst. (2008)
Keyphrases
- document clustering
- text clustering
- clustering algorithm
- clustering method
- k means
- information retrieval
- page layout
- xml documents
- cosine similarity
- document collections
- document retrieval
- metadata
- document image retrieval
- document classification
- retrieval systems
- information retrieval systems
- cluster analysis
- web documents
- hierarchical clustering
- fuzzy clustering
- unsupervised learning
- categorical data
- topic detection
- clustering quality
- document representation
- data points
- text mining
- mutual reinforcement
- topic discovery
- cluster labels
- data driven
- text collections
- spectral clustering
- data clustering
- vector space
- self organizing maps
- text documents