Exploring Maximum Tree Depth and Random Undersampling in Ensemble Trees to Optimize the Classification of Imbalanced Big Data.
John T. Hancock, Taghi M. Khoshgoftaar
Published in: SN Comput. Sci. (2023)
Keyphrases
- big data
- class imbalance
- decision trees
- tree structure
- classification trees
- imbalanced data
- binary classification problems
- machine learning
- randomized trees
- big data analytics
- cloud computing
- data analysis
- data management
- unstructured data
- training set
- social media
- tree nodes
- predictive modeling
- data science
- feature selection
- cost sensitive
- data processing
- text classification
- tree structures
- massive data
- case study
- training data
- databases
- leaf nodes
- random forests
- class distribution
- active learning
- data warehousing
- business intelligence
- business analytics
- vast amounts of data
- knowledge discovery
- model selection
- support vector machine