A CKAN Plugin for Data Harvesting to the Hadoop Distributed File System.
Robert Scholz, Nikolay Tcholtchev, Philipp Lämmel, Ina Schieferdecker
Published in: CLOSER (2017)
Keyphrases
- data sets
- big data
- statistical analysis
- open source
- data quality
- data analysis
- knowledge discovery
- synthetic data
- database
- sensor data
- data objects
- probability distribution
- databases
- experimental data
- data distribution
- search engine
- data processing
- image data
- data points
- data acquisition
- network structure
- statistical methods
- noisy data
- data collection
- missing data
- input data
- machine learning
- small number
- data sources
- xml documents
- training data
- database systems