Robust Recognition of Degraded Documents Using Character N-Grams.
Shrey Dutta, Naveen Sankaran, K. Pramod Sankar, C. V. Jawahar
Published in: Document Analysis Systems (2012)
Keyphrases
- robust recognition
- character n-grams
- n-gram
- variable length
- cross-language
- information retrieval
- document collections
- cross-language information retrieval
- information retrieval systems
- optical character recognition
- document retrieval
- partial occlusion
- metadata
- Arabic documents
- question answering
- text retrieval
- vector space model
- vector space
- user queries
- language model
- query terms
- document representation
- relevant documents
- web documents