PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models.

Huixuan Zhang, Yun Lin, Xiaojun Wan
Published in: CoRR (2024)