PLUTUS: Understanding Data Distribution Tailoring for Machine Learning.
Jiwon Chang, Christina Dionysio, Fatemeh Nargesian, Matthias Boehm
Published in: SIGMOD Conference Companion (2024)
Keyphrases
- data distribution
- machine learning
- data streams
- index structure
- training instances
- high dimensional data
- data points
- machine learning algorithms
- streaming data
- communication cost
- active learning
- learning tasks
- machine learning methods
- distributed data
- computer vision
- learning algorithm
- concept drift
- multi dimensional
- decision boundary
- r tree
- nearest neighbor
- pattern recognition
- multi dimensional data
- data skew
- database
- data analysis
- decision trees
- databases