Challenges in OCR of Devanagari Documents.
Suryaprakash Kompalli, Sankalp Nayak, Srirangaraj Setlur, Venu Govindaraju
Published in: ICDAR (2005)
Keyphrases
- printed documents
- document processing
- optical character recognition
- scanned documents
- document analysis
- document images
- ocr systems
- page layout
- information retrieval
- error correction
- character recognition
- lessons learned
- document collections
- intelligence community
- xml documents
- web documents
- word spotting
- legal documents
- preprocessing
- document representation
- document image retrieval
- metadata
- post processing
- document classification
- information retrieval systems
- scanned images
- free text
- keywords
- digital libraries
- database
- multimedia documents
- language independent
- enterprise search
- text analysis
- document centric
- multimedia
- document clustering
- historical manuscripts
- vector space