The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism.

Yifan Song, Guoyin Wang, Sujian Li, Bill Yuchen Lin
Published in: CoRR (2024)
Keyphrases
  • evaluation method
  • artificial intelligence
  • feature selection
  • gold standard
  • evaluation framework
  • greedy strategy
  • database
  • preprocessing
  • scheduling problem
  • comparative evaluation