Text Degradations and OCR Training.
Elisa H. Barney Smith, Timothy L. Andersen
Published in: ICDAR (2005)
Keyphrases
- text recognition
- printed documents
- text extraction
- document processing
- document analysis
- training set
- optical character recognition
- OCR systems
- training algorithm
- document images
- training process
- supervised learning
- text retrieval
- post processing
- information retrieval
- text processing
- character recognition
- printed text
- digital libraries
- preprocessing
- image quality
- data sets
- free text
- keywords
- text information
- text mining
- training examples
- text documents
- database
- text analysis
- training phase
- textual data
- text lines
- language independent
- text detection
- hidden Markov models
- key concepts
- document clustering
- neural network