Hairetes: A Search Engine for OCR Documents.
Kazem Taghva, Jeffrey S. Coombs
Published in: Document Analysis Systems (2002)
Keyphrases
- search engine
- information retrieval
- keywords
- document processing
- printed documents
- scanned documents
- user queries
- retrieval systems
- document analysis
- document collections
- optical character recognition
- document images
- OCR systems
- retrieval functions
- page layout
- web search engines
- error correction
- current web search engines
- scanned images
- web retrieval
- metadata
- document image retrieval
- web documents
- internet search engines
- XML documents
- document clustering
- relevance ranking
- relevant documents
- ranked list
- retrieved documents
- web pages
- character recognition
- web search
- information retrieval systems
- text documents
- metasearch engine
- document classification
- preprocessing
- post processing
- helping users
- document retrieval
- digital libraries
- search result
- query logs
- keyword search
- search queries
- web information
- multimedia documents
- vector space model