Distributed heterogeneous ensemble learning on Apache Spark for ligand-based virtual screening.
Karima Sid, Mohamed Batouche
Published in: Int. J. Data Min. Model. Manag. (2021)
Keyphrases
- virtual screening
- ensemble learning
- distributed heterogeneous
- drug discovery
- similarity searching
- data sources
- generalization ability
- high throughput
- ensemble methods
- binding sites
- random forest
- concept drift
- similarity search
- base classifiers
- scoring function
- unlabeled data
- query processing
- benchmark datasets
- gene expression
- data mining
- systems biology
- feature space
- data analysis
- support vector
- decision trees