Extracting Instances of Relations from Web Documents Using Redundancy.
Viktor de Boer, Maarten van Someren, Bob J. Wielinga
Published in: ESWC (2006)
Keyphrases
- web documents
- web pages
- semi structured
- information extraction
- keywords
- web search engines
- data extraction
- document classification
- web data
- link structure
- web content
- databases
- vector space model
- domain specific
- data model
- document representation
- textual information
- search engine
- database
- topic specific
- focused crawling
- unstructured documents