Machine Learning Pipelines: Provenance, Reproducibility and FAIR Data Principles.
Sheeba Samuel, Frank Löffler, Birgitta König-Ries
Published in: CoRR (2020)
Keyphrases
- machine learning
- data sets
- data collection
- knowledge discovery
- training data
- statistical analysis
- image data
- data sources
- data quality
- database
- data points
- prior knowledge
- data analysis
- data structure
- high quality
- artificial intelligence
- missing data
- statistical methods
- neural network
- active learning
- small number
- natural language processing
- decision trees
- high dimensional data
- synthetic data
- data distribution
- data objects