Drilling a Large Corpus of Document Images of Geological Information Extraction.
Jean-Louis Debezia, Mélodie Boillet, Christopher Kermorvant, Quentin Barral
Published in: PKDD/ECML Workshops (2) (2021)
Keyphrases
- page segmentation
- document images
- information extraction
- open domain
- information extraction systems
- text mining
- natural language text
- optical character recognition
- document image analysis
- natural language processing
- document analysis
- named entities
- named entity recognition
- relation extraction
- printed documents
- document processing
- question answering
- document image understanding
- page layout
- image binarization
- scanned documents
- natural language
- information retrieval
- machine learning
- word level
- language identification
- textual data
- text lines
- word spotting
- sentence level
- historical documents
- text processing