Correcting OCR text by association with historical datasets.
Susan E. Hauser, Jonathan Schlaifer, Tehseen F. Sabir, Dina Demner-Fushman, Scott Straughan, George R. Thoma
Published in: DRR (2003)
Keyphrases
- text recognition
- optical character recognition
- document processing
- printed documents
- text extraction
- historical manuscripts
- document analysis
- document images
- text data
- information retrieval
- database
- character recognition
- text mining
- OCR systems
- historical documents
- keywords
- post processing
- text retrieval
- text documents
- historical data
- free text
- scanned documents
- text collections
- textual data
- string matching
- preprocessing
- raw data
- page layout
- printed text
- error correction