Clustering document images using a bag of symbols representation.
Eugen Barbu, Pierre Héroux, Sébastien Adam, Éric Trupin
Published in: ICDAR (2005)
Keyphrases
- document images
- document image analysis
- document analysis
- mathematical formulas
- document image understanding
- clustering algorithm
- clustering method
- optical character recognition
- historical documents
- scanned documents
- k-means
- printed documents
- scanned document images
- metadata
- page layout
- page segmentation
- mathematical expressions
- document processing
- bag of words
- image representation
- Indian languages
- word spotting
- line extraction
- image processing