Ancient documents bleed-through evaluation and its application for predicting OCR error rates.
Vincent Rabeux, Nicholas Journet, Jean-Philippe Domenger
Published in: DRR (2011)
Keyphrases
- document images
- document processing
- ocr systems
- printed documents
- document analysis
- scanned documents
- error rate
- optical character recognition
- page layout
- post processing
- information retrieval
- text documents
- information retrieval systems
- document image retrieval
- historical documents
- summary generation
- document classification
- evaluation method
- digital libraries
- document clustering
- document collections
- database
- keywords
- xml documents
- legal documents
- relevance judgments
- error bounds
- document image analysis
- retrieval systems
- web documents
- character recognition
- handwriting recognition
- interactive retrieval
- scanned images
- document retrieval
- preprocessing
- historical manuscripts
- text retrieval