On Data Imbalance in Molecular Property Prediction with Pre-training.
Limin Wang, Masatoshi Hanai, Toyotaro Suzumura, Shun Takashige, Kenjiro Taura
Published in: CoRR (2023)
Keyphrases
- data sets
- data sources
- database
- data collection
- training dataset
- original data
- data distribution
- data analysis
- training set
- synthetic data
- data processing
- knowledge discovery
- meteorological data
- data quality
- high dimensional data
- image data
- end users
- training data
- data points
- labeled data
- active learning
- prediction accuracy
- xml documents
- spatial data
- data structure
- network structure
- high quality
- machine learning
- predictive model
- sampling methods
- neural network