Consideration of parallel data processing over an Apache Spark cluster.
Kasumi Kato, Atsuko Takefusa, Hidemoto Nakada, Masato Oguchi
Published in: IEEE BigData (2017)
Keyphrases
- data processing
- open source
- data management
- clustering algorithm
- parallel processing
- map reduce
- hierarchical clustering
- data analysis
- data points
- open source software
- cluster analysis
- hierarchical structure
- open source projects
- parallel computation
- parallel implementation
- data clustering
- data sets
- shared memory
- information systems
- massively parallel
- data mining
- real world
- databases