Data Contamination Report from the 2024 CONDA Shared Task.
Oscar Sainz, Iker García-Ferrero, Alon Jacovi, Jon Ander Campos, Yanai Elazar, Eneko Agirre, Yoav Goldberg, Wei-Lin Chen, Jenny Chim, Leshem Choshen, Luca D'Amico-Wong, Melissa Dell, Run-Ze Fan, Shahriar Golchin, Yucheng Li, Pengfei Liu, Bhavish Pahwa, Ameya Prabhu, Suryansh Sharma, Emily Silcock, Kateryna Solonko, David Stap, Mihai Surdeanu, Yu-Min Tseng, Vishaal Udandarao, Zengzhi Wang, Ruijie Xu, Jinglin Yang
Published in: CoRR (2024)
Keyphrases
- data sets
- data collection
- database
- data objects
- high quality
- raw data
- synthetic data
- data analysis
- real time
- complex data
- data quality
- data processing
- end users
- computer systems
- statistical methods
- machine learning
- knowledge discovery
- image data
- data sources
- data streams
- spatial data
- application domains
- missing values
- data mining
- neural network
- databases