Combining Language and Graph Models for Semi-structured Information Extraction on the Web.
Zhi Hong, Kyle Chard, Ian T. Foster
Published in: CoRR (2024)
Keyphrases
- semi-structured
- information extraction
- web documents
- data extraction
- web data
- web sources
- web scale
- semi-structured data
- structured data
- web information extraction
- web data sources
- textual data
- natural language
- web mining
- free text
- information integration
- wrapper generation
- website
- web data extraction
- data model
- unstructured data
- natural language processing
- semi-structured documents
- content and structure
- unstructured text
- html pages
- text mining
- data collections
- search interface
- information retrieval
- web pages
- programming language
- database
- machine learning
- web databases
- web users
- web content
- named entities
- information sources
- web information
- machine translation
- knowledge-rich
- semantic web
- linked data
- text documents
- databases
- wrapper induction