Large Synthetic Data from the arXiv for OCR Post Correction of Historic Scientific Articles.
Jill P. Naiman
Morgan G. Cosillo
Peter K. G. Williams
Alyssa Goodman
Published in:
CoRR (2023)
Keyphrases
synthetic data
scientific articles
error correction
optical character recognition
topic modeling
scientific literature
document images
data sets
real world
real image data
character recognition
mri data
topic models
data mining
synthetic datasets