Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models.
Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata
Published in: EMNLP (2023)
Keyphrases
- data sets
- database
- data analysis
- training data
- raw data
- prior knowledge
- data collection
- experimental data
- statistical analysis
- input data
- image data
- image segmentations
- data objects
- document images
- data mining
- accurate models
- data extraction
- historical data
- machine learning
- data structure
- statistical methods
- feature space
- knowledge discovery
- synthetic data
- high dimensional data
- document collections
- xml documents
- unsupervised learning
- data sources
- data processing
- data points