Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit.
Amrith Krishna, Bodhisattwa Prasad Majumder, Rajesh Shreedhar Bhat, Pawan Goyal
Published in: CoNLL (2018)
Keyphrases
- document images
- optical character recognition
- text recognition
- printed documents
- OCR systems
- document processing
- page layout
- error correction
- document analysis
- scanned documents
- character recognition
- document image analysis
- printed text
- text extraction
- text lines
- scanned images
- recognition errors
- post processing
- handwriting recognition
- text regions
- historical documents
- preprocessing
- database
- Viterbi algorithm
- information retrieval
- handwritten documents
- text mining
- finite state automata
- document image retrieval
- character segmentation
- multimedia documents
- text processing
- free text
- text retrieval
- hidden Markov models
- machine learning