Data Pipeline Quality: Influencing Factors, Root Causes of Data-related Issues, and Processing Problem Areas for Developers.
Harald Foidl, Valentina Golendukhina, Rudolf Ramler, Michael Felderer
Published in: CoRR (2023)
Keyphrases
- data sets
- data processing
- data quality
- data sources
- data collection
- database
- databases
- training data
- high quality
- data structure
- data analysis
- data distribution
- data points
- complex data
- synthetic data
- computer systems
- image data
- knowledge discovery
- prior knowledge
- database systems
- input data
- missing data
- sensor data
- probability distribution
- data acquisition
- raw data
- original data