An Efficient Bilingual Optical Character Recognition (English-Oriya) System for Printed Documents.
Sanghamitra Mohanty, Himadri Nandini Dasbebartta, Tarun Kumar Behera
Published in: ICAPR (2009)
Keyphrases
- printed documents
- optical character recognition
- machine translation
- language independent
- cross lingual
- cross language
- character n-grams
- document images
- english text
- character recognition
- character segmentation
- cross language information retrieval
- document analysis
- target language
- ocr systems
- handwriting recognition
- document processing
- word spotting
- machine printed
- word level
- text retrieval
- scanned documents
- printed text
- natural language
- question answering
- document collections
- text categorization
- document image analysis
- chinese characters