An automatic linking service of document images reducing the effects of OCR errors with latent semantics.
Renato Bulcão Neto, José Antonio Camacho Guerrero, Alvaro Barreiro, Javier Parapar, Alessandra A. Macedo
Published in: SAC (2010)
Keyphrases
- document images
- latent semantics
- optical character recognition
- document image analysis
- document analysis
- printed documents
- document processing
- OCR systems
- document image retrieval
- scanned documents
- page layout
- page segmentation
- topic models
- information retrieval
- scanned document images
- historical documents
- scanned images
- search engine
- text lines
- printed text
- shape analysis
- information retrieval systems