Snorkel: Fast Training Set Generation for Information Extraction.
Alexander J. Ratner, Stephen H. Bach, Henry R. Ehrenberg, Christopher Ré
Published in: SIGMOD Conference (2017)
Keyphrases
- information extraction
- training set
- test set
- training data
- machine learning
- precision and recall
- text mining
- natural language processing
- natural language
- nearest neighbor
- named entity recognition
- text processing
- data sets
- classification accuracy
- classification algorithm
- training samples
- training examples
- semi structured
- free text
- textual data
- question answering
- semantic tagging
- supervised learning
- feature space
- active learning
- ontology based information extraction
- web mining
- information retrieval
- text documents
- named entities
- class labels
- structured data
- text summarization
- generation process
- web documents
- support vector machine