Don't Make Your LLM an Evaluation Benchmark Cheater
Kun Zhou
Yutao Zhu
Zhipeng Chen
Wentong Chen
Wayne Xin Zhao
Xu Chen
Yankai Lin
Ji-Rong Wen
Jiawei Han
Published in:
CoRR (2023)
Keyphrases
comparative evaluation
high level
empirical evaluation
evaluation methods
real world
cooperative
multi agent systems
ground truth
evaluation method
comparative analysis
evaluation process