CIDACS-RL: a novel indexing search and scoring-based record linkage system for huge datasets with high accuracy and scalability.
George Caique Gouveia Barbosa, M. Sanni Ali, Bruno Araújo, Sandra Reis, Samila Sena, Maria Yury Ichihara, Julia Pescarini, Rosemeire L. Fiaccone, Leila Amorim, Robespierre Pita, Marcos E. Barreto, Liam Smeeth, Maurício Lima Barreto
Published in: BMC Medical Informatics Decis. Mak. (2020)
Keyphrases
- record linkage
- duplicate detection
- high accuracy
- search algorithm
- reinforcement learning
- indexing techniques
- database
- efficient indexing
- search space
- high efficiency
- efficient search
- data cleaning
- census data
- privacy preserving
- multiple databases
- indexing method
- record pairs
- markov decision processes
- case study
- artificial intelligence
- data mining
- data sets
- linked data
- function approximation
- data warehouse
- relevance feedback
- information extraction
- database systems
- entity resolution
- learning algorithm
- machine learning