DocBed: A Multi-Stage OCR Solution for Documents with Complex Layouts.
Wenzhen Zhu, Negin Sokhandan, Guang Yang, Sujitha Martin, Suchitra Sathyanarayana
Published in: AAAI (2022)
Keyphrases
- multistage
- stochastic programming
- single stage
- production system
- document processing
- stochastic optimization
- lot sizing
- dynamic programming
- document analysis
- metadata
- printed documents
- information retrieval
- assembly systems
- scanned documents
- character recognition
- optimal solution
- document images
- attack detection
- production line
- optical character recognition
- document collections
- web documents
- linear programming
- information retrieval systems
- xml documents
- post processing
- keywords
- document clustering
- relevant documents