Experiences with Managing Data Ingestion into a Corporate Datalake.
Sean Rooney, Daniel Bauer, Luis Garcés-Erice, Peter Urbanetz, Florian Froese, Sasa Tomic
Published in: CIC (2019)
Keyphrases
- data sets
- raw data
- data quality
- data sources
- data collection
- knowledge discovery
- image data
- data processing
- data mining techniques
- historical data
- original data
- data analysis
- digital data
- database
- high dimensional data
- computer systems
- training data
- decision trees
- data management
- end users
- prior knowledge
- xml documents
- experimental data
- high quality
- data objects
- noisy data
- complex data
- machine learning
- neural network