Deduplicating Training Data Mitigates Privacy Risks in Language Models.
Nikhil Kandpal
Eric Wallace
Colin Raffel
Published in:
ICML (2022)
Keyphrases
language model
training data
language modeling
n-gram
probabilistic model
document retrieval
information retrieval
retrieval model
language modelling
classification accuracy
decision trees
supervised learning
test collection
training set
speech recognition
statistical language models
labeled data
vector space model
context sensitive
language model for information retrieval
query expansion
learning algorithm
passage retrieval
document length
question answering
pseudo relevance feedback
document ranking
ad hoc information retrieval
retrieval effectiveness
translation model
smoothing methods
web search
language models for information retrieval