Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate.
Steffi Chern
Ethan Chern
Graham Neubig
Pengfei Liu
Published in:
CoRR (2024)
Keyphrases
language model
evaluation criteria
probabilistic model
document retrieval
language modeling
search engine
decision trees
hidden markov models
retrieval model
evaluation measures