Globule: A Platform for Self-Replicating Web Documents.
Guillaume Pierre, Maarten van Steen
Published in: PROMS (2001)
Keyphrases
- web documents
- semi-structured
- information extraction
- web search engines
- web pages
- keywords
- prefetching
- link structure
- document classification
- html documents
- vector space model
- document representation
- unstructured documents
- web directories
- topic-specific
- web data
- web content
- data sources
- website
- web logs
- machine learning