DocStormer: Revitalizing Multi-Degraded Colored Document Images to Pristine PDF.
Chaowei Liu, Jichun Li, Yihua Teng, Chaoqun Wang, Nuo Xu, Jihao Wu, Dandan Tu
Published in: CoRR (2023)
Keyphrases
- document images
- OCR systems
- document image analysis
- document analysis
- optical character recognition
- document image understanding
- page segmentation
- probability density function
- printed documents
- image binarization
- page layout
- scanned documents
- historical documents
- word spotting
- scanned document images
- document processing
- hidden Markov models
- scanned images
- expectation maximization
- metadata