Efficiently Clustering Documents with Committees.
Patrick Pantel, Dekang Lin
Published in: PRICAI (2002)
Keyphrases
- document clustering
- document collections
- clustering algorithm
- text clustering
- information retrieval
- k-means
- clustering method
- categorical data
- information retrieval systems
- cosine similarity
- relevant documents
- document retrieval
- legal documents
- web documents
- document classification
- metadata
- keywords
- hierarchical clustering
- spectral clustering
- topic discovery
- search engine
- query terms
- text documents
- information-theoretic
- retrieval systems
- self-organizing maps
- xml documents
- user queries
- semantic information
- instance-level constraints
- mutual reinforcement
- semi-supervised
- topic detection
- unsupervised learning
- document set
- test collection
- vector space model
- fuzzy clustering
- data objects
- data clustering