Proving Test Set Contamination in Black-Box Language Models.
Yonatan Oren, Nicole Meister, Niladri S. Chatterji, Faisal Ladhak, Tatsunori Hashimoto
Published in: ICLR (2024)
Keyphrases
- black box
- test set
- language model
- test cases
- language modeling
- error rate
- document retrieval
- training set
- n gram
- black boxes
- speech recognition
- probabilistic model
- retrieval model
- test data
- white box
- information retrieval
- query expansion
- statistical language models
- training data
- language modelling
- smoothing methods
- test collection
- query terms
- integration testing
- white box testing
- relevance model
- computer vision
- translation model
- object detection
- information extraction
- video sequences
- multimedia
- language models for information retrieval