A novel method for mining highly imbalanced high-throughput screening data in PubChem.
Qingliang Li, Yanli Wang, Stephen H. Bryant
Published in: Bioinform. (2009)
Keyphrases
- high throughput
- data sets
- knowledge discovery
- test data
- data collection
- data analysis
- prior knowledge
- data mining techniques
- database
- classification accuracy
- training data
- support vector machine (SVM)
- data acquisition
- statistical methods
- data points
- genome wide
- high dimensional data
- error rate
- genomic data
- proteomic data
- biological data
- pattern classification
- data distribution
- training samples
- support vector machine
- machine learning
- data mining