Mass Digitization of Early Modern Texts With Optical Character Recognition.
Matthew Christy, Anshul Gupta, Elizabeth Grumbach, Laura Mandell, Richard Furuta, Ricardo Gutierrez-Osuna
Published in: ACM Journal on Computing and Cultural Heritage (2018)
Keyphrases
- optical character recognition
- character recognition
- text recognition
- document images
- ocr systems
- character segmentation
- handwriting recognition
- page segmentation
- scanned documents
- printed documents
- text segmentation
- image binarization
- word spotting
- digital libraries
- natural language generation
- binary images
- keywords
- hand written
- information retrieval
- image analysis
- text extraction
- historical manuscripts
- image processing