A Supervised Machine Learning Approach for Duplicate Detection over Gazetteer Records.
Bruno Martins
Published in: GeoS (2011)
Keyphrases
- duplicate detection
- supervised machine learning
- record linkage
- labeled training examples
- data cleaning
- manually annotated
- named entities
- graph search
- active learning
- supervised learning
- privacy preserving
- databases
- ground truth
- database
- linked data
- natural language processing
- information extraction
- e-learning
- feature selection
- data sets