A study of the behavior of several methods for balancing machine learning training data.
Gustavo E. A. P. A. Batista, Ronaldo C. Prati, Maria Carolina Monard
Published in: SIGKDD Explor. (2004)
Keyphrases
- machine learning
- training data
- machine learning methods
- preprocessing
- supervised learning
- statistical methods
- cross validation
- significant improvement
- machine learning algorithms
- computer vision
- pattern recognition
- noisy data
- computational cost
- experimental design
- data sets
- inductive learning
- benchmark datasets
- labeled data
- text classification
- domain knowledge
- learning algorithm
- data mining