High-speed rough clustering for very large document collections.
Kazuaki Kishida
Published in: J. Assoc. Inf. Sci. Technol. (2010)
Keyphrases
- document collections
- high speed
- document clustering
- topic detection
- information retrieval systems
- document retrieval
- document clusters
- information retrieval
- text retrieval
- clustering algorithm
- test collection
- index terms
- clustering method
- text clustering
- digital libraries
- ad hoc retrieval
- k means
- cross language
- data collections
- search engine
- text collections
- document representation
- vector space model
- topic extraction
- scatter gather
- text corpora
- text classification
- text mining
- data points
- document archives