Selecting the Appropriate Data Sampling Approach for Imbalanced and High-Dimensional Bioinformatics Datasets.
David J. Dittman, Taghi M. Khoshgoftaar, Amri Napolitano
Published in: BIBE (2014)
Keyphrases
- data sets
- data collection
- high dimensional
- database
- data analysis
- raw data
- training data
- data points
- image data
- data processing
- missing data
- high dimensional data
- training dataset
- original data
- data distribution
- test data
- data mining techniques
- data structure
- data mining algorithms
- statistical methods
- high dimensionality
- data mining
- sampling methods
- imbalanced data
- imbalanced datasets