Text Similarity Measures in a Data Deduplication Pipeline for Customers Records.
Witold Andrzejewski
Bartosz Bebel
Pawel Boinski
Mariusz Sienkiewicz
Robert Wrembel
Published in:
DOLAP (2023)
Keyphrases
data sets
data analysis
similarity measure
data collection
database
data structure
synthetic data
data processing
data quality
raw data
statistical analysis
image data
data points
record linkage
text mining
high dimensional data
euclidean distance
data sources
original data
high quality
historical data