OCR with Word Prediction Technique for Bilingual Documents.
Supachai Tangwongsan, Buntida Suvacharakulton
Published in: ACIS-ICIS (2012)
Keyphrases
- printed documents
- multiword
- page layout
- optical character recognition
- parallel corpus
- document image retrieval
- word spotting
- document images
- document processing
- document analysis
- parallel corpora
- word pairs
- character recognition
- bilingual lexicon
- indian languages
- word frequencies
- character n grams
- source language
- scanned documents
- bilingual dictionaries
- english chinese
- recognition errors
- keywords
- cross language information retrieval
- text lines
- printed text
- text corpus
- historical manuscripts
- language independent
- term frequency
- document collections
- ocr systems
- word alignment
- word frequency
- information retrieval
- n gram
- document clustering
- sentence level
- machine translation
- cross language
- word sense disambiguation
- word level
- text documents
- cross lingual
- machine translation system
- document retrieval
- query expansion
- text corpora
- query translation
- sentence pairs
- search engine
- retrieval systems