DEFT: A corpus for definition extraction in free- and semi-structured text.
Sasha Spala, Nicholas A. Miller, Yiming Yang, Franck Dernoncourt, Carl Dockhorn
Published in: LAW@ACL (2019)
Keyphrases
- semi-structured
- information extraction
- free text
- information extraction systems
- text mining
- data extraction
- structured data
- web documents
- entity extraction
- text data
- linguistic patterns
- natural language text
- unstructured text
- web information extraction
- text documents
- content and structure
- web data extraction
- text corpora
- information integration
- textual data
- wrapper generation
- extraction patterns
- natural language processing
- information retrieval
- semi-structured data
- web data
- data model
- semi-structured documents
- text corpus
- html pages
- relation extraction
- manually annotated
- web data sources
- machine learning
- unstructured data
- sentence level
- structured knowledge
- named entities
- database
- automatic extraction
- knowledge rich
- knowledge discovery
- website
- wrapper induction
- document level
- data collections
- web sources
- database systems
- artificial intelligence