BART for Post-Correction of OCR Newspaper Text.
Elizabeth Soper, Stanley Fujimoto, Yen-Yun Yu
Published in: W-NUT (2021)
Keyphrases
- text recognition
- error correction
- document processing
- text extraction
- optical character recognition
- printed documents
- document analysis
- document images
- text retrieval
- page layout
- OCR systems
- information retrieval
- scanned documents
- text documents
- text data
- post processing
- text mining
- character recognition
- neural network
- scanned images
- Viterbi algorithm
- text information
- preprocessing
- query expansion
- keywords
- text processing
- key concepts
- text categorization
- data sets