UNLV-ISRI document collection for research in OCR and information retrieval.
Kazem Taghva, Thomas A. Nartker, Julie Borsack, Allen Condit
Published in: Document Recognition and Retrieval (2000)
Keyphrases
- document collections
- information retrieval
- information retrieval systems
- document retrieval
- test collection
- relevant documents
- text retrieval
- digital libraries
- optical character recognition
- document representation
- search engine
- cross language
- character recognition
- document clustering
- text collections
- document images
- retrieval systems
- retrieval effectiveness
- information seeking
- information access
- vector space model
- text mining
- xml retrieval
- information extraction
- document clusters
- retrieval model
- document image analysis
- index terms
- inverted file
- term weighting
- latent semantic analysis
- language model
- query expansion
- query terms