Using symbolic objects to cluster web documents.
Esteban Meneses, Oldemar Rodríguez-Rojas
Published in: WWW (2006)
Keyphrases
- web documents
- data objects
- information extraction
- semi-structured
- web pages
- content similarity
- similar objects
- web search engines
- html documents
- vector space model
- document classification
- prefetching
- returned by a search engine
- web data
- web content
- d objects
- high level
- topic specific
- data mining
- natural language processing
- data model
- textual information
- link structure
- clustering algorithm
- website
- search engine