OCR4all - Open-Source OCR and HTR Across the Centuries.
Florian Langhanki, Maximilian Wehner, Torsten Roeder, Christian Reul
Published in: DH (2023)
Keyphrases
- optical character recognition
- open source
- document images
- text recognition
- character recognition
- post processing
- recognition errors
- error correction
- printed documents
- source code
- document processing
- preprocessing
- document image analysis
- document image retrieval
- machine learning
- scanned documents
- character segmentation
- neural network
- page layout
- text localization and recognition
- database systems
- search engine
- data mining