EGA: An Algorithm for Automatic Semi-structured Web Documents Extraction.
Liyu Li, Shiwei Tang, Dongqing Yang, Tengjiao Wang, Zhihua Su
Published in: DASFAA (2004)
Keyphrases
- semi structured
- web documents
- information extraction
- wrapper generation
- learning algorithm
- tree structured patterns
- wrapper induction
- data model
- text mining
- semistructured documents
- database
- web data extraction
- html documents
- unstructured data
- information integration
- knowledge representation
- website
- web pages
- web data
- web search engines
- web information extraction
- web data sources
- artificial intelligence
- machine learning