Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit.
Amrith Krishna, Bodhisattwa Prasad Majumder, Rajesh Shreedhar Bhat, Pawan Goyal
Published in: CoRR (2018)
Keyphrases
- document images
- optical character recognition
- text recognition
- OCR systems
- printed documents
- document processing
- document analysis
- page layout
- error correction
- scanned documents
- text extraction
- text lines
- post processing
- character recognition
- recognition errors
- preprocessing
- historical documents
- document image analysis
- text regions
- scanned images
- printed text
- machine translation
- handwriting recognition
- document image retrieval
- Viterbi algorithm
- free text
- hidden Markov models
- information retrieval
- character segmentation
- database
- text processing
- textual data
- text retrieval
- text mining