A distributed evolutionary based instance selection algorithm for big data using Apache Spark.
Liyang Qin, Xiaoli Wang, Linzi Yin, Zhaohui Jiang
Published in: Appl. Soft Comput. (2024)
Keyphrases
- selection algorithm
- big data
- data intensive
- cloud computing
- big data analytics
- data analysis
- prototype selection
- data management
- resource selection
- open source
- commodity hardware
- map reduce
- data processing
- vast amounts of data
- social media
- data science
- business intelligence
- unstructured data
- distributed environment
- search algorithm
- genetic algorithm
- information processing
- data warehousing
- information systems
- massive datasets
- data analytics
- support vector machine (SVM)
- feature subset
- neural network
- genetic programming
- peer to peer
- management system
- knowledge discovery
- feature space
- information retrieval