Investigating Data Contamination in Modern Benchmarks for Large Language Models.
Chunyuan Deng
Yilun Zhao
Xiangru Tang
Mark Gerstein
Arman Cohan
Published in:
NAACL-HLT (2024)
Keyphrases
language model
training data
probabilistic model
language modeling
feature selection
speech recognition
document retrieval
machine learning
information retrieval
language modelling