Early modern OCR project (eMOP) at Texas A&M University: using Aletheia to train Tesseract.
Katayoun Torabi, Jessica Durgan, Bryan Tarpley
Published in: ACM Symposium on Document Engineering (2013)
Keyphrases
- texas a&m university
- post processing
- project management
- optical character recognition
- character recognition
- future plans
- case study
- preprocessing
- software development
- neural network
- nsf funded
- recognition errors
- text recognition
- error correction
- document images
- database
- decision trees
- image processing
- decision making
- data sets