DocBed: A Multi-Stage OCR Solution for Documents with Complex Layouts.
Wenzhen Zhu, Negin Sokhandan, Guang Yang, Sujitha Martin, Suchitra Sathyanarayana
Published in: CoRR (2022)
Keyphrases
- multistage
- stochastic programming
- document processing
- production system
- single stage
- information retrieval
- dynamic programming
- printed documents
- web documents
- document analysis
- document collections
- information retrieval systems
- stochastic optimization
- lot sizing
- optical character recognition
- post processing
- scanned documents
- assembly systems
- page layout
- document retrieval
- text documents
- document images
- optimal policy
- xml documents
- metadata
- character recognition
- optimal solution
- attack detection
- keywords