Multi-page document analysis based on format consistency and clustering.
Liangcai Gao, Zhi Tang, Jing Fang, Xiaofan Lin
Published in: Int. J. Comput. Appl. Technol. (2010)
Keyphrases
- document analysis
- document images
- electronic documents
- character recognition
- image analysis
- clustering algorithm
- website
- document image analysis
- k means
- word level
- text analysis
- word recognition
- pattern recognition
- document processing
- metadata
- machine vision
- web pages
- knowledge base
- video analysis
- databases
- word segmentation
- printed documents
- line fitting
- image classification
- data points
- data structure
- multimedia
- neural network