A data recipient centered de-identification method to retain statistical attributes.
Tamas S. Gal, Thomas C. Tucker, Aryya Gangopadhyay, Zhiyuan Chen
Published in: J. Biomed. Informatics (2014)
Keyphrases
- synthetic data
- statistical information
- statistical methods
- correlation analysis
- missing data
- data sets
- high accuracy
- test data
- preprocessing
- detection method
- prior knowledge
- input data
- prior information
- similarity measure
- database
- data collection
- missing values
- cost function
- training data
- clustering method
- segmentation method
- data processing
- statistical significance
- attribute values
- statistical model
- noisy data
- original data
- image data
- significant improvement
- xml documents
- data analysis
- data mining
- em algorithm
- data mining techniques
- data points
- probability distribution
- high quality
- feature extraction
- clustering algorithm
- statistical inference
- correspondence analysis