Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance.
Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman
Published in: CoRR (2024)
Keyphrases
- data distribution
- data points
- distributed data
- data streams
- high dimensional data
- index structure
- data sets
- streaming data
- data structure
- multi dimensional data
- data analysis
- database
- pattern recognition
- communication cost
- data skew
- training set
- training instances
- decision boundary
- data mining
- training data
- training examples
- distributed systems
- data sources
- query processing
- high dimensional