Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines.
Michael Kuchnik, Ana Klimovic, Jiri Simsa, Virginia Smith, George Amvrosiadis
Published in: MLSys (2022)
Keyphrases
- machine learning
- data sets
- raw data
- database
- data processing
- original data
- training data
- background knowledge
- data analysis
- data collection
- data distribution
- missing data
- data structure
- high quality
- information extraction
- data points
- neural network
- statistical analysis
- synthetic data
- data acquisition
- prior knowledge
- spatial data
- data objects
- knowledge discovery
- data mining
- statistical methods
- experimental data
- sensor data
- privacy preserving
- high dimensional data
- computer systems
- data sources
- image data