Classification-Based Record Linkage With Pseudonymized Data for Epidemiological Cancer Registries.
Yannik Siegert, Xiaoyi Jiang, Volker Krieg, Sebastian Bartholomäus
Published in: IEEE Trans. Multim. (2016)
Keyphrases
- record linkage
- data analysis
- data sets
- database
- machine learning
- data cleaning
- missing data
- cancer datasets
- original data
- end users
- classification accuracy
- support vector machine
- data warehouse
- text classification
- data processing
- support vector
- training data
- data quality
- metadata
- feature selection
- duplicate detection
- multiple databases
- census data
- artificial intelligence