Document Sanitization: Measuring Search Engine Information Loss and Risk of Disclosure for the Wikileaks cables.
David F. Nettleton, Daniel Abril
Published in: Privacy in Statistical Databases (2012)
Keyphrases
- information loss
- search engine
- disclosure risk
- information retrieval
- retrieval systems
- keywords
- user queries
- statistical disclosure control
- confidential information
- privacy protection
- data quality
- extra information
- web search
- web search engines
- relevance ranking
- data anonymization
- information retrieval systems
- confidential data
- data publishing
- search queries
- database
- categorical attributes
- query terms
- decision table
- relevant documents
- data warehouse
- data sources
- data model
- databases