Identifying the potential of Near Data Computing for Apache Spark.
Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguadé
Published in: CoRR (2017)
Keyphrases
- data sets
- high quality
- data analysis
- raw data
- complex data
- data processing
- open source
- small number
- query processing
- training data
- data sources
- multimedia data
- data distribution
- synthetic data
- high dimensional data
- statistical analysis
- information retrieval
- data structure
- data collection
- decision trees
- knowledge discovery
- experimental data
- data points
- data acquisition
- temporal information
- databases
- original data
- noisy data
- data quality
- real time