Cross-validation pitfalls when selecting and assessing regression and classification models.
Damjan Krstajic, Ljubomir J. Buturovic, David E. Leahy, Simon Thomas
Published in: J. Cheminformatics (2014)
Keyphrases
- cross validation
- classification models
- model selection
- regression problems
- feature selection
- support vector
- variable selection
- decision trees
- hyperparameters
- regression model
- training data
- generalization error
- training set
- cross validated
- feature set
- information criterion
- base learners
- error estimates
- gaussian process
- linear regression
- models built
- feature subset
- sample size
- nearest neighbor classifiers
- support vector regression
- selection algorithm
- unseen data
- genetic programming