DPDLLM: A Black-box Framework for Detecting Pre-training Data from Large Language Models.
Baohang Zhou, Zezhong Wang, Lingzhi Wang, Hongru Wang, Ying Zhang, Kehui Song, Xuhui Sui, Kam-Fai Wong
Published in: ACL (Findings) (2024)
Keyphrases
- black box
- language model
- training data
- probabilistic model
- language modeling
- n gram
- hybrid systems
- language modelling
- query expansion
- black boxes
- document retrieval
- information retrieval
- white box
- relevance model
- test set
- decision trees
- feature selection
- data sets
- test data
- context sensitive
- test collection
- speech recognition
- query language
- integration testing