Privacy-Preserving Data Deduplication for Enhancing Federated Learning of Language Models.
Aydin Abadi, Vishnu Asutosh Dasu, Sumanta Sarkar
Published in: CoRR (2024)
Keyphrases
- privacy preserving
- language model
- record linkage
- data privacy
- data sources
- privacy sensitive
- private data
- vertically partitioned data
- privacy requirements
- sensitive data
- sensitive information
- data analysis
- language modeling
- private information
- distributed data
- document retrieval
- preserving privacy
- privacy preservation
- information retrieval
- privacy guarantees
- horizontally partitioned data
- data quality
- high dimensional data
- privacy preserving data mining
- privacy issues
- privacy concerns
- data sharing
- test collection
- data collection
- digital libraries