A Primitive Operator for Similarity Joins in Data Cleaning.
Surajit Chaudhuri, Venkatesh Ganti, Raghav Kaushik
Published in: ICDE (2006)
Keyphrases
- data cleaning
- similarity join
- data integration
- outlier detection
- data quality
- record linkage
- similarity search
- edit distance
- text classification
- metric space
- data processing
- uncertain data
- database
- data warehousing
- join algorithms
- missing values
- data warehouse
- fraud detection
- xml data
- information extraction
- machine learning
- similar objects
- high dimensional
- decision trees
- web usage mining
- data management
- data sources
- data sets