Streaming state management methods for real-time data deduplication.
João Victor Esteves, Sérgio Lifschitz, Rosa Maria Costa, Ana Carolina Almeida
Published in: SBBD (2020)
Keyphrases
- real time
- data sets
- data processing
- high dimensional data
- data mining techniques
- image data
- preprocessing
- database
- statistical methods
- data analysis
- data mining methods
- data structure
- data reduction
- input data
- original data
- data acquisition
- missing values
- human experts
- synthetic data
- noisy data
- streaming data
- continuous stream
- data distribution
- information systems
- training data
- data points
- end users
- data sources
- data management
- management system
- spectral clustering
- probability distribution
- information management
- multiple sources
- significant improvement
- complex structures
- data cleaning
- data streams