A large empirical assessment of the role of data balancing in machine-learning-based code smell detection.
Fabiano Pecorelli, Dario Di Nucci, Coen De Roover, Andrea De Lucia
Published in: J. Syst. Softw. (2020)
Keyphrases
- machine learning
- data sets
- computer vision
- raw data
- data processing
- image data
- data analysis
- original data
- databases
- training data
- high quality
- data structure
- data quality
- sensor data
- data sources
- machine learning algorithms
- statistical methods
- synthetic data
- high dimensional data
- statistical analysis
- data collection
- small number
- data mining techniques
- database
- information extraction
- data points
- active learning
- clustering algorithm
- sensor networks
- detection method
- background knowledge
- xml documents
- machine learning methods
- high dimensional
- database systems
- automatic detection
- neural network
- big data