TopSig: A Scalable System for Hashing and Retrieving Document Signatures.
Timothy Chappell, Shlomo Geva
Published in: AIRS (2015)
Keyphrases
- signature file
- information retrieval
- document images
- inverted file
- document clustering
- document collections
- document classification
- vector space model
- web documents
- information retrieval systems
- binary codes
- file organization
- inverted index
- document image retrieval
- signature verification
- highly scalable
- hash functions
- hashing algorithm
- text documents
- text categorization
- information extraction
- data structure
- indexing techniques
- document representation
- query evaluation
- random projections
- structured documents
- web scale
- similarity search
- document analysis
- index structure
- query expansion
- distance measure