A Deep OCR for Degraded Bangla Documents.
Ayan Chaudhury, Partha Sarathi Mukherjee, Sudip Das, Chandan Biswas, Ujjwal Bhattacharya
Published in: ACM Trans. Asian Low Resour. Lang. Inf. Process. (2022)
Keyphrases
- OCR systems
- optical character recognition
- character segmentation
- document images
- printed documents
- document processing
- character recognition
- scanned documents
- word spotting
- text recognition
- handwriting recognition
- page layout
- document collections
- handwritten documents
- Indian languages
- document analysis
- information retrieval
- document retrieval
- post processing
- keywords
- information retrieval systems
- metadata
- text documents
- XML documents
- machine vision
- historical documents
- text lines
- document image analysis
- recognition errors
- text extraction
- document classification
- relevant documents
- retrieval systems
- web documents
- outdoor images
- vector space model
- scene images
- error correction
- user queries
- n-gram
- co-occurrence
- training set
- web pages