BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics.
Liang Ma
Shuyang Cao
Robert L. Logan IV
Di Lu
Shihao Ran
Ke Zhang
Joel R. Tetreault
Alejandro Jaimes
Published in:
ACL (1) (2023)
Keyphrases
evaluation metrics
evaluation methods
evaluation criteria
real world
pairwise
evaluation methodology
search engine
evaluation model
comparative evaluation
database
data mining
website
evaluation method
quantitative evaluation
causal models