Extracting unstructured data from template generated web documents.
Ling Ma, Nazli Goharian, Abdur Chowdhury, Misun Chung
Published in: CIKM (2003)
Keyphrases
- web documents
- unstructured data
- semi structured
- structured data
- information extraction
- unstructured text
- textual data
- textual information
- big data
- semi structured data
- web data
- keywords
- web search engines
- html documents
- web pages
- semistructured data
- wrapper induction
- data sources
- information management
- data warehouse
- data sets
- raw data
- relational databases
- search engine
- data mining
- pattern matching
- matching algorithm
- high dimensional
- databases