On sampling from data with duplicate records.
Alireza Heidari, Shrinu Kushagra, Ihab F. Ilyas
Published in: CoRR (2020)
Keyphrases
- database
- data sources
- data sets
- training data
- data objects
- data analysis
- synthetic data
- databases
- image data
- statistical analysis
- input data
- noisy data
- raw data
- statistical methods
- experimental data
- missing data
- data collection
- probability distribution
- data processing
- data points
- prior knowledge
- data structure
- high quality
- data cleaning
- random sample