Clustering OCR-ed texts for browsing document image database.
Koji Tsuda, Shuji Senda, Michihiko Minoh, Katsuo Ikeda
Published in: ICDAR (1995)
Keyphrases
- image database
- document images
- document clustering
- document processing
- image data
- image content
- text documents
- content based retrieval
- printed documents
- scanned documents
- document analysis
- keywords
- image retrieval
- region based image retrieval
- clustering algorithm
- information retrieval systems
- optical character recognition
- k means
- image indexing
- document collections
- information retrieval
- content based image
- retrieval systems
- multimedia
- color histogram
- database
- metadata
- retrieval precision
- text lines
- high level
- image collections
- multiscale
- digital libraries
- text mining
- unsupervised learning