Improving data scientist efficiency with provenance.
Jingmei Hu, Jiwon Joung, Maia Jacobs, Krzysztof Z. Gajos, Margo I. Seltzer
Published in: ICSE (2020)
Keyphrases
- data quality
- data sets
- original data
- database
- high quality
- data structure
- image data
- noisy data
- raw data
- synthetic data
- high dimensional data
- data points
- input data
- complex data
- data processing
- data collection
- computer systems
- missing data
- small number
- information systems
- social media
- prior knowledge
- data analysis
- information retrieval
- dimensionality reduction
- metadata
- training data
- missing values
- experimental data
- application domains
- data mining algorithms
- attribute values
- feature space
- statistical analysis
- data sources
- end users