Non-Repeatable Experiments and Non-Reproducible Results: The Reproducibility Crisis in Human Evaluation in NLP.

Anya Belz, Craig Thomson, Ehud Reiter, Simon Mille
Published in: ACL (Findings) (2023)
Keyphrases
  • natural language processing
  • information extraction
  • evaluation methods
  • databases
  • artificial intelligence
  • database
  • data sets
  • neural network
  • information systems
  • natural language
  • human subjects
  • evaluation method