Optimized Dual Threshold Entity Resolution For Electronic Health Record Databases - Training Set Size And Active Learning.
Erel Joffe, Michael J. Byrne, Phillip Reeder, Jorge R. Herskovic, Craig W. Johnson, Allison B. McCoy, Elmer V. Bernstam
Published in: AMIA (2013)
Keyphrases
- entity resolution
- active learning
- training set size
- electronic health records
- databases
- training set
- data integration
- record linkage
- generalization error
- information extraction
- query processing
- clinical data
- data sources
- data cleaning
- knowledge discovery
- clinical trials
- supervised learning
- link prediction
- training data
- database systems
- database
- data warehouse
- semi supervised
- maximum likelihood
- labeled data
- training examples
- health care
- class distribution
- machine learning
- data sets
- experimental design
- support vector machine
- classification accuracy
- cost sensitive
- semi structured
- cross validation
- semi supervised learning
- data management
- pairwise
- feature space
- learning algorithm