Sequence-to-Label Script Identification for Multilingual OCR.
Yasuhisa Fujii, Karel Driesen, Jonathan Baccash, Ash Hurst, Ashok C. Popat
Published in: CoRR (2017)
Keyphrases
- character recognition
- digital libraries
- optical character recognition
- post processing
- multilingual
- cross language
- language independent
- recognition errors
- real world
- text recognition
- printed documents
- document image analysis
- Viterbi algorithm
- error correction
- cross lingual
- document images
- input data
- image classification
- feature selection