Checking Semantic Integrity Constraints on Integrated Web Documents.
Franz Weitl, Burkhard Freitag
Published in: ER (Workshops) (2004)
Keyphrases
- web documents
- semantic integrity constraints
- web pages
- semi-structured
- information extraction
- data model
- web search engines
- document classification
- html documents
- focused crawling
- keywords
- web data
- document representation
- link structure
- vector space model
- textual information
- unstructured documents
- web mining
- web content
- object-oriented
- social annotations
- information retrieval