A K-means Improved CTGAN Oversampling Method for Data Imbalance Problem.
Chunsheng An, Jingtong Sun, Yifeng Wang, Qingjie Wei
Published in: QRS (2021)
Keyphrases
- synthetic data
- k-means
- data sets
- data collection
- test data
- detection method
- training data
- prior knowledge
- input data
- image data
- clustering method
- noisy data
- missing data
- data analysis
- prior information
- statistical methods
- missing values
- segmentation method
- similarity measure
- database
- objective function
- significant improvement
- sampling methods
- training samples
- information loss
- spectral clustering
- preprocessing
- original data
- data distribution
- decision trees
- association rules
- xml documents
- feature selection
- data sources
- class imbalance
- knn