Less is More: An Exploration of Data Redundancy with Active Dataset Subsampling.
Kashyap Chitta, Jose M. Alvarez, Elmar Haussmann, Clément Farabet
Published in: CoRR (2019)
Keyphrases
- data sets
- database
- data processing
- raw data
- synthetic data
- data analysis
- data collection
- data sources
- training data
- high quality
- redundant data
- data visualization
- complex data
- sensor data
- statistical analysis
- image data
- data structure
- decision trees
- computer systems
- input data
- high dimensional data
- information systems
- data mining techniques
- experimental data
- test data
- statistical methods
- learning algorithm
- original data
- data objects
- historical data
- probability distribution