Semantic PDF Segmentation for Legacy Documents in Technical Documentation.
Jan Oevermann
Published in: SEMANTiCS (2018)
Keyphrases
- semantic information
- semantic relationships
- topic segmentation
- linguistic analysis
- multiscale
- level set
- image segmentation
- pdf files
- legacy systems
- segmentation algorithm
- segmentation method
- semantic content
- semantic relevance
- document-centric
- semantically related
- information retrieval
- information retrieval systems
- medical images
- document collections
- document retrieval
- probability density function
- natural language
- unstructured documents
- pdf documents
- semantic similarity
- document content
- metadata
- keywords
- xml documents
- web documents
- text documents
- relevant documents
- natural language text
- written in natural language
- co-occurrence
- domain ontology
- document clustering
- semantic structure
- reverse engineering
- semantic classes
- probability distribution function
- semantic network
- multi-document summarization
- semantic features