Training text chunkers on a silver standard corpus: can silver replace gold?
Ning Kang, Erik M. van Mulligen, Jan A. Kors
Published in: BMC Bioinform. (2012)
Keyphrases
- training corpus
- test set
- broad coverage
- open domain
- text corpus
- supervised machine learning
- text collections
- author identification
- text retrieval
- document corpus
- plain text
- natural language text
- manually annotated
- topic segmentation
- textual features
- information extraction systems
- newspaper articles
- lexical features
- linguistic information
- recognizing textual entailment
- named entity disambiguation
- text corpora
- text processing
- text classifiers
- multiword
- textual data
- machine learning
- text data
- keywords
- training set
- text documents
- database
- text mining
- training examples
- anaphora resolution
- scientific papers
- free text
- training process
- noun phrases