Fast and Intuitive Clustering of Web Documents.
Oren Zamir, Oren Etzioni, Omid Madani, Richard M. Karp
Published in: KDD (1997)
Keyphrases
- web documents
- content similarity
- information extraction
- semi-structured
- clustering algorithm
- web pages
- k-means
- keywords
- document classification
- clustering method
- vector space model
- document representation
- web search engines
- web content
- web logs
- prefetching
- web data
- link structure
- html documents
- document clustering
- textual information
- unstructured documents
- focused crawling
- returned by a search engine
- information retrieval
- geographic information
- structured documents
- relational databases
- learning algorithm