A Vision for Managing Extreme-Scale Data Hoards.
Jeremy Logan, Kshitij Mehta, Gerd Heber, Scott Klasky, Tahsin M. Kurç, Norbert Podhorszki, Patrick M. Widener, Matthew Wolf
Published in: ICDCS (2019)
Keyphrases
- synthetic data
- data collection
- data sets
- data processing
- data structure
- data analysis
- raw data
- high quality
- test data
- computer systems
- database
- global scale
- data quality
- data acquisition
- sensor data
- image processing
- data sources
- data distribution
- feature selection
- historical data
- metadata
- data objects
- statistical methods
- missing values
- website
- prior knowledge
- data mining techniques
- application domains
- machine learning
- high dimensional data
- information sources
- probability distribution
- computer vision
- dimensionality reduction
- input data