LayoutLM: Pre-training of Text and Layout for Document Image Understanding.
Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou
Published in: CoRR (2019)
Keyphrases
- document image understanding
- document images
- text retrieval
- keywords
- training examples
- test set
- database
- text mining
- page layout
- document layout
- training corpus
- training set
- information retrieval
- free text
- training process
- training algorithm
- text documents
- text information
- document analysis
- text classifiers
- training phase
- semantic information
- supervised learning
- support vector machine
- data sets