Investigating Data Contamination for Pre-training Language Models.

Minhao Jiang, Ken Ziyu Liu, Ming Zhong, Rylan Schaeffer, Siru Ouyang, Jiawei Han, Sanmi Koyejo
Published in: CoRR (2024)
Keyphrases
  • language model
  • language modeling
  • information retrieval
  • knowledge discovery
  • training data
  • context sensitive