Creating a Gold Standard Corpus for the Extraction of Chemistry-Disease Relations from Patent Texts.
Antje Schlaf, Claudia Bobach, Matthias Irmer
Published in: LREC (2014)
Keyphrases
- gold standard
- genia corpus
- text mining
- ground truth
- semi-automatic
- named entities
- information extraction
- information extraction systems
- text segments
- newspaper articles
- natural language text
- manual segmentation
- key phrase extraction
- manually annotated
- training corpus
- linguistic patterns
- automatic extraction
- rhetorical structure theory
- multiword
- information retrieval
- text corpus
- linguistic analysis
- patent retrieval
- text documents
- mechanical turk
- entity extraction
- linguistic features
- world knowledge
- citation networks
- scientific papers
- english words
- linguistic information
- registration accuracy
- semantic relations
- topic models
- co-occurrence
- natural language processing
- natural language