Generating Realistic Test Datasets for Duplicate Detection at Scale Using Historical Voter Data.

Fabian Panse, André Düjon, Wolfram Wingerath, Benjamin Wollmer
Published in: EDBT (2021)
Keyphrases
  • data sets
  • database
  • duplicate detection
  • data mining techniques
  • data analysis
  • data quality
  • data processing
  • web data
  • log data
  • data cleaning
  • end users
  • data points