Don't Make Your LLM an Evaluation Benchmark Cheater.

Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han
Published in: CoRR (2023)
Keyphrases
  • comparative evaluation
  • high level
  • empirical evaluation
  • evaluation methods
  • real world
  • cooperative
  • multi agent systems
  • ground truth
  • evaluation method
  • comparative analysis
  • evaluation process