Differentiated strategies for replicating Web documents.
Guillaume Pierre, Ihor Kuz, Maarten van Steen, Andrew S. Tanenbaum
Published in: Comput. Commun. (2001)
Keyphrases
- web documents
- web pages
- keywords
- information extraction
- web search engines
- document classification
- prefetching
- semi-structured
- textual information
- vector space model
- web data
- link structure
- focused crawling
- web mining
- html documents
- learning algorithm
- document representation
- web directories
- web content
- web logs
- databases
- active learning
- website
- machine learning