The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism.
Yifan Song
Guoyin Wang
Sujian Li
Bill Yuchen Lin
Published in:
CoRR (2024)
Keyphrases
evaluation method
artificial intelligence
feature selection
gold standard
evaluation framework
greedy strategy
database
preprocessing
scheduling problem
comparative evaluation