From Pixels To True XML Structures In Digital Document Images.
Matthew Y. Ma, Jinhong Katherine Guo, Patrick Shen-Pei Wang
Published in: Int. J. Pattern Recognit. Artif. Intell. (2004)
Keyphrases
- document images
- document image analysis
- document analysis
- text lines
- XML documents
- document image understanding
- printed documents
- optical character recognition
- language identification
- document processing
- digital libraries
- historical documents
- page segmentation
- scanned documents
- input image
- metadata
- image binarization
- document layout
- scanned document images
- handwritten documents
- speech recognition
- document collections
- line extraction
- relational databases