InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning.
Ziheng Qin, Kai Wang, Zangwei Zheng, Jianyang Gu, Xiangyu Peng, Daquan Zhou, Yang You
Published in: CoRR (2023)
Keyphrases
- data sets
- labelled data
- data processing
- data collection
- input data
- raw data
- synthetic data
- data sources
- high quality
- training data
- data analysis
- prior knowledge
- training dataset
- knowledge discovery
- data distribution
- sensor data
- database
- neural network
- experimental data
- missing data
- training examples
- dynamic environments
- statistical analysis
- web pages
- image data
- data points
- probability distribution
- database systems