Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents.
Eric Michael Smith
Orion Hsu
Rebecca Qian
Stephen Roller
Y-Lan Boureau
Jason Weston
Published in:
ConvAI@ACL (2022)
Keyphrases
multi agent systems
gold standard
neural network
preprocessing
cooperative
empirical studies
benchmark datasets
evaluation methods
significant improvement
intelligent agents
human experts
evaluation metrics