DS-Prox: Dataset Proximity Mining for Governing the Data Lake.
Ayman Alserafi, Toon Calders, Alberto Abelló, Oscar Romero
Published in: SISAP (2017)
Keyphrases
- data analysis
- data sets
- data mining techniques
- knowledge discovery
- data processing
- data collection
- data points
- data quality
- data structure
- database
- synthetic data
- transactional data
- raw data
- high dimensional data
- image data
- training dataset
- social networks
- statistical analysis
- interesting patterns
- high quality
- association rule mining
- data mining algorithms
- data mining tasks
- data sources
- data mining methods
- complex data
- experimental data
- association rules
- historical data
- mining algorithm
- probability distribution
- text mining
- training data
- decision trees
- data management