Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models.

Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata

Published in: CoRR (2022)

Keyphrases

data sets
synthetic data
data processing
experimental data
database
statistical methods
data analysis
historical data
prior knowledge
probabilistic model
original data
accurate models
raw data
information retrieval
data collection
data points
statistical analysis
image data
machine learning
end users
high quality
information extraction
text categorization
high dimensional data
knowledge discovery
learning models
xml documents
total cost
predictive model