Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation.
Aparna Elangovan
Jiayuan He
Karin Verspoor
Published in:
CoRR (2021)
Keyphrases
data sets
raw data
high quality
database
data analysis
data processing
historical data
noisy data
original data
experimental data
data collection
input data
data mining techniques
small number
data sources
decision trees
probability distribution
natural language
artificial intelligence
complex data
data mining