OCR Post Correction for Endangered Language Texts.
Shruti Rijhwani, Antonios Anastasopoulos, Graham Neubig
Published in: EMNLP (1) (2020)
Keyphrases
- natural language
- error correction
- optical character recognition
- language learning
- native language
- text generation
- preprocessing
- programming language
- document images
- character recognition
- linguistic knowledge
- post processing
- language processing
- text documents
- text recognition
- natural language generation
- error analysis
- printed documents
- document processing
- computational linguistics
- database
- data sets