Undersampling Strategy Based on Clustering to Improve the Performance of Splice Site Classification in Human Genes.
Claudia Galarda Varassin, Alexandre Plastino, Helena Cristina da Gama Leitão, Bianca Zadrozny
Published in: DEXA Workshops (2013)
Keyphrases
- splice site
- microarray gene
- ROC analysis
- microarray data analysis
- gene expression profiles
- classification accuracy
- gene expression
- class imbalance
- supervised learning
- unsupervised learning
- sequence analysis
- high dimensionality
- ROC curve
- machine learning
- feature space
- feature vectors
- microarray data
- gene expression data
- microarray
- support vector machine
- data sets
- secondary structure
- receiver operating characteristic
- training set
- high dimensional data
- multi class
- decision trees
- alternative splicing
- feature selection