Evolutionary undersampling for extremely imbalanced big data classification under Apache Spark.
Isaac Triguero, Mikel Galar, D. Merino, Jesús Maillo, Humberto Bustince, Francisco Herrera
Published in: CEC (2016)
Keyphrases
- big data
- class imbalance
- cloud computing
- data management
- open source
- vast amounts of data
- data analysis
- supervised learning
- big data analytics
- data science
- class labels
- machine learning
- social media
- information retrieval
- data processing
- decision trees
- support vector machine
- training data
- business intelligence
- high volume
- information systems
- real world
- knowledge discovery
- massive data
- unstructured data
- class distribution
- cost sensitive
- feature selection
- data warehousing
- decision making
- text classification
- training set
- management system