Benchmarking Apache Spark and Hadoop MapReduce on Big Data Classification.
Taha Tekdogan, Ali Cakmak
Published in: CoRR (2022)
Keyphrases
- big data
- cloud computing
- data intensive
- data analytics
- map reduce
- data analysis
- data management
- data intensive computing
- unstructured data
- open source
- big data analytics
- business intelligence
- predictive modeling
- high volume
- data processing
- data science
- vast amounts of data
- social media
- data warehousing
- machine learning
- supervised learning
- distributed computing
- massive data
- text classification
- feature selection
- parallel algorithm
- information processing
- commodity hardware
- knowledge discovery