Detection Masking for Improved OCR on Noisy Documents.
Daniel Rotman, Ophir Azulai, Inbar Shapira, Yevgeny Burshtein, Udi Barzelay
Published in: CoRR (2022)
Keyphrases
- document processing
- printed documents
- scanned documents
- document analysis
- text lines
- detection method
- information retrieval
- character recognition
- page layout
- detection algorithm
- anomaly detection
- low signal to noise ratio
- xml documents
- optical character recognition
- web documents
- document images
- text detection
- keywords
- automatic detection
- noisy environments
- scanned images
- database
- document image retrieval
- false alarms
- text documents
- object detection
- false positives
- detection rate
- free text
- document classification
- document clustering
- handwriting recognition
- text information
- vector space model
- relevant documents
- retrieval systems
- test collection
- word spotting
- post processing
- information retrieval systems
- error correction