A scaleable document clustering approach for large document corpora.
Niall Rooney, David W. Patterson, Mykola Galushka, Vladimir Dobrynin
Published in: Inf. Process. Manag. (2006)
Keyphrases
- document clustering
- document corpus
- document collections
- document similarity
- text mining
- text documents
- document representation
- document clusters
- text clustering
- topic extraction
- clustering algorithm
- clustering method
- tf-idf
- tolerance rough set
- natural language processing
- vector space model
- topic detection
- document set
- text collections
- text data
- web documents
- similar documents
- k-means
- test collection
- image retrieval
- data mining