Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models.
Yihong Dong
Xue Jiang
Huanyu Liu
Zhi Jin
Ge Li
Published in:
CoRR (2024)
Keyphrases
language model
training data
knowledge discovery
statistical language models
probabilistic model
language modeling
n-gram
document retrieval
query expansion