Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation.

Chunyuan Deng, Yilun Zhao, Yuzhao Heng, Yitong Li, Jiannan Cao, Xiangru Tang, Arman Cohan
Published in: CoRR (2024)
Keyphrases
  • language model
  • language modeling
  • probabilistic model
  • n-gram
  • vector space model
  • decision trees
  • speech recognition