All aboard the Databus!: LinkedIn's scalable consistent change data capture platform.
Shirshanka Das, Chavdar Botev, Kapil Surlaker, Bhaskar Ghosh, Balaji Varadarajan, Sunil Nagaraj, David Zhang, Lei Gao, Jemiah Westerman, Phanindra Ganti, Boris Shkolnik, Sajid Topiwala, Alexander Pachev, Naveen Somasundaram, Subbu Subramaniam
Published in: SoCC (2012)
Keyphrases
- data sets
- missing data
- real time
- statistical analysis
- data analysis
- historical data
- synthetic data
- high quality
- original data
- knowledge discovery
- application domains
- noisy data
- small number
- data points
- data structure
- complex data
- database
- data quality
- web data
- network structure
- continuously changing
- missing values
- high dimensional data
- data processing
- data mining techniques
- image data
- end users
- training data
- social networks
- machine learning