A bin and hash method for analyzing reference data and descriptors in machine learning potentials.
Martín Leandro Paleico, Jörg Behler
Published in: Mach. Learn. Sci. Technol. (2021)
Keyphrases
- synthetic data
- statistical methods
- data sets
- machine learning
- missing data
- machine learning methods
- prior knowledge
- test data
- input data
- noisy data
- data points
- database
- data analysis
- missing values
- objective function
- data mining
- knowledge discovery
- support vector machine
- em algorithm
- clustering method
- data structure
- similarity measure
- information loss
- active learning
- prior information
- feature selection
- high order
- data distribution
- segmentation method
- detection method
- graph cuts
- pairwise
- probabilistic model