Optical character recognition errors and their effects on natural language processing.
Daniel P. Lopresti
Published in: AND (2008)
Keyphrases
- optical character recognition
- natural language processing
- character recognition
- document images
- character segmentation
- text recognition
- information extraction
- ocr systems
- knowledge representation
- machine learning
- page segmentation
- printed documents
- text processing
- image processing
- residual errors
- text mining
- machine vision
- handwriting recognition
- english text
- scanned documents
- wordnet
- printed text
- text summarization
- machine translation
- image enhancement
- co-occurrence
- word spotting
- natural language
- historical manuscripts
- video sequences