LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking.
Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
Published in: CoRR (2022)
Keyphrases
- image data
- single image
- image features
- scanned documents
- input image
- text documents
- information retrieval
- image retrieval
- multiscale
- printed documents
- image segmentation
- image representation
- image analysis
- keywords
- image regions
- segmentation method
- machine learning
- web documents
- web images
- empirically derived
- artificial intelligence
- document processing
- image classification
- image content
- high resolution
- low level
- edge detection
- test images
- image collections
- text lines
- text mining
- document content
- digital documents
- textual information
- multimedia documents
- text information
- text content
- similarity measure
- semantic information