ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction.
Zheng Huang, Kai Chen, Jianhua He, Xiang Bai, Dimosthenis Karatzas, Shijian Lu, C. V. Jawahar
Published in: CoRR (2021)
Keyphrases
- scanned documents
- information extraction
- document images
- text detection
- optical character recognition
- text lines
- natural language processing
- precision and recall
- scanned images
- information retrieval
- free text
- noise removal
- named entities
- scanned document images
- relation extraction
- named entity recognition
- machine learning
- natural language
- text documents
- question answering
- document processing
- text mining
- web mining
- web documents
- relational learning
- semi-structured
- international competition
- ontology-based information extraction
- structured data
- textual data
- machine translation
- video analysis
- conditional random fields
- open domain
- preprocessing
- real world
- character segmentation
- hidden Markov models
- document analysis
- outdoor scenes
- text processing
- character recognition