A middle-ware approach to leverage the distributed data de-duplication capability on HPC and Cloud storage systems.
Hsing-bung Chen, Sihai Tang, Song Fu
Published in: IEEE BigData (2020)
Keyphrases
- distributed data
- storage systems
- file system
- high performance computing
- distributed data mining
- cloud computing
- data sharing
- computing resources
- data distribution
- storage devices
- communication cost
- flash memory
- data mining algorithms
- database systems
- grid computing
- formal methods
- massively parallel
- parallel computing
- fault tolerance
- object oriented
- knowledge discovery