Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning.
Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe
Published in: J. Lang. Technol. Comput. Linguistics (2018)
Keyphrases
- active learning
- optical character recognition
- high accuracy
- computational cost
- annotation effort
- scanned documents
- prediction accuracy
- combining multiple
- learning strategies
- imbalanced data classification
- machine learning
- recognition errors
- handwriting recognition
- majority voting
- error rate
- post processing
- learning process
- digital libraries
- computational complexity