Predicting accuracy on large datasets from smaller pilot data.
Mark Johnson, Peter Anderson, Mark Dras, Mark Steedman
Published in: ACL (2) (2018)
Keyphrases
- data sets
- database
- raw data
- data analysis
- high quality
- data collection
- small number
- original data
- data points
- knowledge discovery
- data processing
- experimental conditions
- historical data
- test data
- data sources
- data mining
- computational cost
- data streams
- labeled data
- prediction accuracy
- synthetic data
- data distribution
- data structure
- statistical methods
- training data
- database systems
- data reduction
- training dataset
- neural network
- massive data