Unsupervised information extraction from unstructured, ungrammatical data sources on the World Wide Web.
Matthew Michelson, Craig A. Knoblock
Published in: Int. J. Document Anal. Recognit. (2007)
Keyphrases
- data sources
- information extraction
- structured data
- semi-structured
- unstructured text
- unstructured documents
- free text
- structured and unstructured data
- unstructured data
- data collections
- text mining
- data model
- databases
- machine learning
- precision and recall
- semi-supervised
- data integration
- natural language processing
- information retrieval
- data extraction
- integrating heterogeneous
- relation extraction
- named entity recognition
- question answering
- web documents
- natural language
- data driven
- supervised learning
- data warehouse
- ontology-based information extraction
- named entities
- heterogeneous data sources
- open domain
- disparate data sources
- conditional random fields
- database
- unsupervised learning
- data sets
- geospatial data
- heterogeneous data
- word sense disambiguation
- text documents
- knowledge discovery
- web pages