Structural Characterization of Popular Web Documents.
Abdolreza Abhari, Sivarama P. Dandamudi, Shikharesh Majumdar
Published in: Int. J. Comput. Their Appl. (2002)
Keyphrases
- web documents
- information extraction
- semi structured
- web search engines
- web pages
- keywords
- structural information
- focused crawling
- document classification
- html documents
- document representation
- unstructured documents
- textual information
- vector space model
- topic specific
- web logs
- structured documents
- language model
- geographic information
- information retrieval systems
- content similarity
- data mining