Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks.
Anas Himmi, Ekhine Irurozki, Nathan Noiry, Stéphan Clémençon, Pierre Colombo
Published in: CoRR (2023)
Keyphrases
- natural language processing
- scoring methods
- evaluation methods
- missing data
- robust estimation
- machine translation
- neural network
- wordnet
- natural language
- database systems
- artificial intelligence
- information extraction
- computationally efficient
- image sequences
- partial occlusion
- evaluation metrics
- evaluation model
- comparative evaluation
- text analysis
- text processing
- data mining