Probabilistic parallelisation of blocking non-matched records for big data.
Chenxiao Dou, Daniel Sun, Yi-Cheng Chen, Guoqiang Li, Jianquan Liu
Published in: IEEE BigData (2016)
Keyphrases
- big data
- record linkage
- cloud computing
- data processing
- high volume
- data intensive
- data management
- business intelligence
- big data analytics
- data analysis
- unstructured data
- vast amounts of data
- massive data
- databases
- data visualization
- social media
- knowledge discovery
- health informatics
- data science
- case study
- data intensive computing
- data stores
- massive datasets
- machine learning
- statistical databases
- data warehousing
- data warehouse
- data mining