Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models.
Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata
Published in: CoRR (2022)
Keyphrases
- data sets
- synthetic data
- data processing
- experimental data
- database
- statistical methods
- data analysis
- historical data
- prior knowledge
- probabilistic model
- original data
- accurate models
- raw data
- information retrieval
- data collection
- data points
- statistical analysis
- image data
- machine learning
- end users
- high quality
- information extraction
- text categorization
- high dimensional data
- knowledge discovery
- learning models
- xml documents
- total cost
- predictive model