DiT: Self-supervised Pre-training for Document Image Transformer.
Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei
Published in: ACM Multimedia (2022)
Keyphrases
- document images
- document image analysis
- document analysis
- document image understanding
- language identification
- document processing
- optical character recognition
- printed documents
- scanned documents
- page segmentation
- page layout
- image processing
- document layout
- historical documents
- scanned document images
- OCR systems
- line extraction
- gray level
- object recognition