On OCR ground truths and OCR post-correction gold standards, tools and formats.
Martin Reynaert
Published in: DATeCH (2014)
Keyphrases
- machine vision
- character recognition
- optical character recognition
- image processing
- error correction
- character segmentation
- printed documents
- text recognition
- document images
- post processing
- metadata
- ground truth
- preprocessing
- recognition errors
- end to end
- end users
- web services
- multimedia
- neural network
- document processing
- scanned documents
- page layout