Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations.
Weixin Liang, Yining Mao, Yongchan Kwon, Xinyu Yang, James Zou
Published in: ICML (2023)
Keyphrases
- data sets
- data analysis
- database
- complex data
- data collection
- correlation analysis
- raw data
- synthetic data
- knowledge discovery
- small number
- data warehouse
- high accuracy
- maximum likelihood
- data processing
- high dimensional data
- data distribution
- original data
- training data
- prediction accuracy
- data sources
- prior knowledge
- correlation coefficient
- computational complexity
- highly correlated