Distributed subdata selection for big data via sampling-based approach.
Haixiang Zhang, HaiYing Wang
Published in: Comput. Stat. Data Anal. (2021)
Keyphrases
- big data
- data intensive
- data management
- cloud computing
- data analysis
- social media
- unstructured data
- high volume
- distributed systems
- commodity hardware
- data intensive computing
- vast amounts of data
- data processing
- big data analytics
- massive data
- business intelligence
- data visualization
- knowledge discovery
- mobile agents
- data analytics
- distributed environment
- database systems
- web services
- data science
- data mining
- data sets
- peer to peer
- open source
- text mining
- end users
- massive datasets
- information systems