Multilingual OCR system for South Indian scripts and English documents: An approach based on Fourier transform and principal component analysis.
Aradhya V. N. Manjunath, G. Hemantha Kumar, S. Noushath
Published in: Eng. Appl. Artif. Intell. (2008)
Keyphrases
- fourier transform
- principal component analysis
- parallel corpus
- cross language
- indian languages
- ocr systems
- document processing
- multilingual information retrieval
- document images
- optical character recognition
- frequency domain
- cross lingual
- printed documents
- document collections
- scanned documents
- signal processing
- comparable corpora
- cross language information retrieval
- document retrieval
- document analysis
- language independent
- cross language ir
- digital libraries
- parallel corpora
- information retrieval
- page layout
- radon transform
- fourier domain
- machine translation
- information retrieval systems
- fourier analysis
- polar coordinates
- character recognition
- covariance matrix
- independent component analysis
- discrete fourier transform
- arabic documents
- fast fourier transform
- bilingual dictionaries
- frequency spectrum
- pattern recognition
- document clustering
- machine translation system
- feature extraction
- handwriting recognition
- power spectral density
- face recognition
- log polar
- feature vectors
- multiresolution
- natural language
- spectral estimation
- image processing
- query translation
- computer vision