Training on the Test Task Confounds Evaluation and Emergence.
Ricardo Dominguez-Olmedo, Florian E. Dorner, Moritz Hardt
Published in: CoRR (2024)
Keyphrases
- evaluation process
- training algorithm
- evaluation criteria
- information technology
- training set
- supervised learning
- active learning
- comparative evaluation
- training phase
- evaluation method
- test cases
- software development
- real time
- expert systems
- data structure
- training data
- e-learning
- learning algorithm
- machine learning