Detection of data drift and outliers affecting machine learning model performance over time.
Samuel Ackerman, Eitan Farchi, Orna Raz, Marcel Zalmanovici, Parijat Dube
Published in: CoRR (2020)
Keyphrases
- machine learning
- data sets
- experimental data
- data analysis
- statistical methods
- probabilistic model
- data points
- empirical data
- database
- expert knowledge
- data structure
- computational model
- synthetic data
- measured data
- raw data
- data processing
- mathematical model
- missing data
- predictive model
- knowledge discovery
- training data
- anomaly detection
- data collection
- input data
- data mining techniques
- data sources
- prior knowledge
- high quality
- data quality
- learning models
- simulation data
- high dimensional data
- detection algorithm
- labeled data
- clustering algorithm