OCR and post-correction of historical Finnish texts.
Senka Drobac, Pekka Kauppinen, Krister Lindén
Published in: NODALIDA (2017)
Keyphrases
- error correction
- optical character recognition
- document images
- post processing
- historical data
- character recognition
- document processing
- natural language
- text recognition
- historical manuscripts
- error detection
- preprocessing
- legal texts
- natural language text
- neural network
- ocr systems
- recognition errors
- document image retrieval
- printed documents
- error analysis
- keywords
- character segmentation
- natural language generation
- language model
- search engine
- information retrieval