Optimizing Data Pipelines for Machine Learning in Feature Stores.
Rui Liu, Kwanghyun Park, Fotis Psallidas, Xiaoyong Zhu, Jinghui Mo, Rathijit Sen, Matteo Interlandi, Konstantinos Karanasos, Yuanyuan Tian, Jesús Camacho-Rodríguez
Published in: Proc. VLDB Endow. (2023)
Keyphrases
- machine learning methods
- machine learning
- statistical methods
- data sets
- synthetic data
- data analysis
- data sources
- data quality
- data collection
- data processing
- prior knowledge
- sensor data
- data mining techniques
- experimental data
- data structure
- training data
- raw data
- computer vision
- big data
- missing values
- knowledge discovery
- semi-supervised learning
- image data
- data points
- probability distribution
- xml documents
- decision trees
- feature selection
- data mining