Automatic information extraction from semi-structured Web pages by pattern discovery.
Chia-Hui Chang, Chun-Nan Hsu, Shao-Chen Lui
Published in: Decis. Support Syst. (2003)
Keyphrases
- semi structured
- pattern discovery
- information extraction
- wrapper generation
- data extraction
- web documents
- web information extraction
- web pages
- web data
- structured data
- web data extraction
- html pages
- pattern mining
- web sources
- web data sources
- text mining
- recurring patterns
- natural language processing
- data analysis
- information integration
- free text
- recursive partitioning
- semi structured data
- data model
- search engine
- interesting patterns
- machine learning
- association rule mining
- information retrieval
- sequential patterns
- website
- named entities
- data mining
- structured knowledge
- html documents
- web databases
- motif discovery
- keywords
- semi automatic
- discovering patterns
- web mining
- natural language
- web content
- text documents
- artificial intelligence
- temporal data mining
- web users
- link analysis
- textual data
- graph mining
- unstructured data
- active learning