New Workflow for QSAR Model Development from Small Data Sets: Small Dataset Curator and Small Dataset Modeler. Integration of Data Curation, Exhaustive Double Cross-Validation, and a Set of Optimal Model Selection Techniques.
Pravin Ambure, Agnieszka Gajewicz-Skretna, M. Natália Dias Soeiro Cordeiro, Kunal Roy
Published in: J. Chem. Inf. Model. (2019)
Keyphrases
- model selection
- cross validation
- information criterion
- multivariate regression
- statistical inference
- parameter estimation
- unseen data
- probability distribution
- hyperparameters
- selection criterion
- data sets
- cross validated
- variable selection
- database
- test data
- support vector
- bayesian methods
- sample size
- generalization error
- statistical model
- leave one out cross validation
- regression model
- prior information
- error estimation
- feature set
- meta learning
- prior knowledge
- bayesian information criterion
- closed form
- automatic model selection
- machine learning
- feature selection
- bayesian networks
- active learning
- training set
- em algorithm
- generalization ability
- maximum a posteriori
- missing values
- probabilistic model
- missing data
- mixture model