Performing information extraction to improve OCR error detection in semi-structured historical documents.
Thomas L. Packer
Published in: HIP@ICDAR (2011)
Keyphrases
- semi-structured
- information extraction
- error detection
- error correction
- structured data
- historical documents
- free text
- data extraction
- text mining
- web documents
- document images
- natural language processing
- data model
- handwriting recognition
- wrapper generation
- structured knowledge
- machine translation
- character recognition
- textual data
- optical character recognition
- artificial intelligence
- machine learning
- historical manuscripts
- data integration
- image analysis
- information retrieval