Labelling OCR Ground Truth for Usage in Repositories.
Matthias Boenig, Konstantin Baierer, Volker Hartmann, Maria Federbusch, Clemens Neudecker
Published in: DATeCH (2019)
Keyphrases
- ground truth
- optical character recognition
- high quality
- digital libraries
- document images
- character recognition
- post processing
- gold standard
- learning objects
- preprocessing
- ground truth data
- manually labeled
- error correction
- metadata
- segmented images
- neural network
- data collections
- document processing
- text recognition
- recognition errors
- digital objects
- source code
- human subjects
- medical images
- web services
- web usage
- printed documents
- multimedia
- image processing
- real time