Keyphrases
- text recognition
- printed documents
- recognition errors
- document processing
- optical character recognition
- text extraction
- character recognition
- OCR systems
- information retrieval
- document analysis
- error correction
- document images
- string matching
- page layout
- mathematical expressions
- text retrieval
- web documents
- text mining
- preprocessing
- digital libraries
- text data
- textual data
- automatically extracted
- scanned images
- scanned documents
- text lines
- error detection
- text information
- complex background
- text detection
- Viterbi algorithm
- natural language generation
- database
- text summarization
- end to end
- text documents
- post processing
- information extraction
- data sets