Characterizing Documents for Information Extraction.
Jeffrey Hudack, Mark Zappavigna, Warren Geiler, Michael Corey
Published in: IC-AI (2008)
Keyphrases
- information extraction
- free text
- text documents
- web documents
- information retrieval
- unstructured documents
- text mining
- textual data
- text analysis
- natural language text
- natural language processing
- named entities
- document classification
- information retrieval systems
- document clustering
- structured data
- named entity recognition
- legal documents
- unstructured text
- natural language
- semi structured
- precision and recall
- machine learning
- information extraction systems
- relational learning
- document retrieval
- document analysis
- metadata
- document collections
- text categorization
- knowledge discovery
- ontology based information extraction
- vector space model
- document representation
- text processing
- text classification
- xml documents
- probabilistic model
- electronic documents
- language model
- multi document summarization
- database
- question answering
- textual information
- retrieval systems