A new resampling method of imbalanced large data based on class boundary.
Sheng Xing, Junhai Zhai, Xiaolan Wang, Ming Yuan
Published in: ICMLC (2015)
Keyphrases
- synthetic data
- data sets
- input data
- noisy data
- data collection
- missing values
- pairwise
- data analysis
- prior knowledge
- statistical significance
- image data
- prior information
- database
- detection method
- missing data
- test data
- rare events
- classification trees
- similarity measure
- statistical methods
- data distribution
- clustering method
- xml documents
- data structure
- preprocessing
- support vector machine
- training dataset
- probability distribution
- data sources