From HTML documents to web tables and rules.
Kai Simon, Georg Lausen, Harold Boley
Published in: ICEC (2006)
Keyphrases
- html documents
- web documents
- web pages
- web content
- web information extraction
- html pages
- website
- automatic extraction
- semi structured
- web page retrieval
- semi structured data
- structured documents
- repeated patterns
- semantic information
- topic maps
- web data
- document representation
- web mining
- databases
- information extraction
- web search engines
- web search
- association rules
- keywords
- vector space model
- xml documents
- background knowledge
- semantic web
- semistructured data
- knowledge discovery
- database systems
- machine learning