Generating a Common Index for Multi-Authored Web Documents.
Erich Weichselgartner, Marc André Selig
Published in: WebNet (1998)
Keyphrases
- web documents
- information extraction
- web pages
- semi structured
- web search engines
- document classification
- keywords
- web content
- web data
- database
- focused crawling
- textual information
- machine learning
- topic specific
- html documents
- vector space model
- index structure
- information retrieval
- social annotations
- tree structured patterns