An Empirical Study on Data Balancing in Machine Learning Based Software Traceability Methods.
Bangchao Wang, Zihan Wang, Hongyan Wan, Xingfu Li, Yang Deng
Published in: IJCNN (2023)
Keyphrases
- machine learning
- statistical methods
- data sets
- data analysis
- high dimensional data
- data collection
- computer systems
- high quality
- data points
- human experts
- knowledge discovery
- data mining applications
- image data
- data mining methods
- software architecture
- training data
- synthetic data
- predictive model
- raw data
- noisy data
- missing values
- data reduction
- machine learning methods
- software systems
- statistical analysis
- data processing
- data sources
- significant improvement
- preprocessing
- clustering algorithm
- missing data
- data mining techniques
- data warehouse
- spectral clustering
- case study
- multiple sources
- decision trees
- machine learning approaches