On the nonlinear correlation of ML performance between data subpopulations.
Weixin Liang, Yining Mao, Yongchan Kwon, Xinyu Yang, James Zou
Published in: CoRR (2023)
Keyphrases
- data sets
- data collection
- synthetic data
- correlation analysis
- application domains
- data analysis
- complex data
- raw data
- training data
- data points
- real time
- data quality
- computer systems
- data processing
- database
- databases
- data structure
- high quality
- information retrieval
- data mining algorithms
- big data
- neural network
- data objects
- network structure
- experimental data
- decision trees
- probability distribution
- spatial data
- missing data
- image sequences
- video sequences
- image data
- knowledge discovery
- end users