Building LinkedIn's Real-time Activity Data Pipeline.
Ken Goodhope, Joel Koshy, Jay Kreps, Neha Narkhede, Richard Park, Jun Rao, Victor Yang Ye
Published in: IEEE Data Eng. Bull. (2012)
Keyphrases
- data sets
- raw data
- real time
- database
- data processing
- synthetic data
- training data
- data analysis
- data quality
- statistical analysis
- noisy data
- human activities
- data acquisition
- experimental data
- spatial data
- sensor data
- data collection
- high speed
- low cost
- end users
- prior knowledge
- computer systems
- high quality
- social networks
- missing data
- input data
- data distribution
- small number
- data points
- machine learning
- web data