Optimizing ETL by a Two-Level Data Staging Method.
Xiufeng Liu, Nadeem Iftikhar, Huan Huo, Per Sieverts Nielsen
Published in: Int. J. Data Warehous. Min. (2016)
Keyphrases
- synthetic data
- input data
- statistical methods
- test data
- raw data
- missing data
- data sets
- noisy data
- knowledge discovery
- high quality
- preprocessing
- data quality
- prior information
- missing values
- high accuracy
- prior knowledge
- statistical significance
- information loss
- spectral clustering
- training samples
- data sources
- similarity measure
- correlation analysis
- significant improvement
- database
- dynamic programming
- original data
- training data
- image data
- data structure
- support vector machine
- statistical analysis
- feature space
- medical images
- data mining
- detection method
- cost function
- EM algorithm
- k-means
- data processing