Handwritten Information Extraction from Historical Census Documents.
Thibauld Nion, Farès Menasri, Jérôme Louradour, Cédric Sibade, Thomas Retornaz, Pierre-Yves Metaireau, Christopher Kermorvant
Published in: ICDAR (2013)
Keyphrases
- information extraction
- historical documents
- word spotting
- free text
- text documents
- historical manuscripts
- web documents
- information retrieval
- unstructured documents
- handwriting recognition
- document analysis
- text processing
- document collections
- text mining
- natural language processing
- unstructured text
- handwritten text
- text analysis
- textual data
- document images
- natural language text
- handwritten documents
- machine learning
- word recognition
- document classification
- character recognition
- named entities
- semi-structured
- information retrieval systems
- united states
- precision and recall
- text line segmentation
- handwritten document images
- census data
- structured data
- relevant documents
- relation extraction
- conditional random fields
- machine translation
- named entity recognition
- web mining
- information extraction systems
- xml documents
- numeral strings
- word sense disambiguation
- document retrieval
- arabic documents
- digital libraries
- probabilistic model
- vector space model
- text classification
- newspaper articles
- document image analysis
- semantic information