Transcribing and Translating Bilingual Text using OCR Tesseract and Deep Learning.
Patrick D. Cerna, Rhodessa J. Cascaro, Khing Dave E. Laurente, Joshua Victor B. Cabahug, Deus William B. Carino, Jakob Hans Maraguinot
Published in: ICEIT (2024)
Keyphrases
- deep learning
- printed documents
- unsupervised learning
- unsupervised feature learning
- document analysis
- optical character recognition
- document images
- machine learning
- machine translation
- character recognition
- keywords
- deep architectures
- information retrieval
- mental models
- text mining
- pattern recognition
- multiscale
- active learning
- graph cuts
- weakly supervised
- text lines
- image segmentation
- clustering algorithm