A Corpus for Comparative Evaluation of OCR Software and Postcorrection Techniques.
Stoyan Mihov, Klaus U. Schulz, Christoph Ringlstetter, Veselka Dojchinova, Vanja Nakova
Published in: ICDAR (2005)
Keyphrases
- comparative evaluation
- page segmentation
- post processing
- software systems
- scoring methods
- software development
- preprocessing
- source code
- user interface
- software architecture
- character recognition
- optical character recognition
- document images
- page layout
- scanned images
- error correction
- information retrieval
- nearest neighbor
- feature selection