Document Layout Analysis for Semantic Information Extraction.
Weronika T. Adrian, Nicola Leone, Marco Manna, Cinzia Marte
Published in: AI*IA (2017)
Keyphrases
- information extraction
- unstructured documents
- web documents
- text documents
- semantic information
- text representation
- text mining
- information retrieval
- natural language
- natural language processing
- text summarization
- natural language text
- semantic tagging
- document classification
- named entity recognition
- precision and recall
- document clustering
- semi structured
- machine learning
- document content
- semantic knowledge
- structured data
- information retrieval systems
- document collections
- question answering
- high level
- web mining
- named entities
- semantic annotation
- retrieval systems
- linguistic patterns
- keywords
- tf idf
- semantic structure
- domain specific
- cross document
- relation extraction
- document images
- free text
- document retrieval
- text processing
- data model
- data extraction
- semantic features
- conditional random fields
- bag of words
- machine translation