Building and Improving an OCR Classifier for Republican Chinese Newspaper Text.
Matthias Arnold
Konstantin Henke
Published in: DHd (2022)
Keyphrases
chinese text
text recognition
optical character recognition
printed documents
text summarization
document analysis
document processing
ocr systems
text extraction
training data
text retrieval
information retrieval
text mining
keyword extraction
scanned documents
document images
english text
classification algorithm
classification method
support vector machine
character recognition
text data
feature space
text classifiers
feature selection
text lines
page layout
chinese texts
decision trees
text documents
web documents
training samples
training set
writing style
lexical features
preprocessing
keywords
learning algorithm
text information
text regions
error correction
svm classifier
natural language processing