Reading in the mist: high-quality optical character recognition based on freely available early modern digitized books.
Andrea Sangiacomo, Hugo Dirk Hogenbirk, Raluca A. Tanasescu, Antonia Karaisl, Nick White
Published in: Digit. Scholarsh. Humanit. (2022)
Keyphrases
- historical manuscripts
- optical character recognition
- high quality
- character recognition
- document images
- OCR systems
- text recognition
- word spotting
- character segmentation
- historical documents
- handwriting recognition
- low quality
- image binarization
- printed documents
- electronic books
- page segmentation
- scanned documents
- English text
- web pages