Clustering and Labeling a Web Scale Document Collection using Wikipedia clusters.
Richi Nayak, Rachel Mills, Christopher M. De Vries, Shlomo Geva
Published in: Web-KRM@CIKM (2014)
Keyphrases
- document collections
- web scale
- document clustering
- document clusters
- clustering algorithm
- information retrieval systems
- document retrieval
- cluster analysis
- information retrieval
- test collection
- text retrieval
- clustering method
- k-means
- similar documents
- digital libraries
- data points
- unsupervised learning
- semi-structured
- relevant documents
- web images
- document set
- text mining
- document archives
- text documents
- image search
- data sources
- database
- related documents
- image segmentation