Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks.
Andrea Sottana
Bin Liang
Kai Zou
Zheng Yuan
Published in:
CoRR (2023)
Keyphrases
language model
evaluation metrics
language modeling
probabilistic model
test collection
precision and recall
speech recognition
document retrieval
information retrieval
query expansion
n-gram
learning to rank
context sensitive
vector space model
language modelling
language models for information retrieval