Detecting Pretraining Data from Large Language Models.
Weijia Shi
Anirudh Ajith
Mengzhou Xia
Yangsibo Huang
Daogao Liu
Terra Blevins
Danqi Chen
Luke Zettlemoyer
Published in:
CoRR (2023)
Keyphrases
language model
probabilistic model
language modeling
training data
n-gram
co-occurrence
machine translation