Scalable Techniques for Clustering the Web.
Taher H. Haveliwala, Aristides Gionis, Piotr Indyk
Published in: WebDB (Informal Proceedings) (2000)
Keyphrases
- web scale
- clustering algorithm
- website
- parameter free
- k means
- web applications
- clustering method
- web pages
- web people search
- web objects
- information sources
- web resources
- hierarchical clustering
- web content
- document clustering
- web documents
- high dimensional data
- self organizing maps
- information theoretic
- cluster analysis
- data objects
- completely unsupervised
- web snippets
- web data
- categorical data
- web technologies
- user generated content
- data points
- tag information
- database
- end users
- fuzzy clustering
- unsupervised learning
- lightweight
- anomaly detection
- semantic web