Fast Processing and Querying of 170TB of Genomics Data via a Repeated And Merged BloOm Filter (RAMBO).
Gaurav Gupta, Minghao Yan, Benjamin Coleman, Bryce Kille, Ryan A. Leo Elworth, Tharun Medini, Todd J. Treangen, Anshumali Shrivastava
Published in: SIGMOD Conference (2021)
Keyphrases
- data processing
- database
- data sets
- data collection
- training data
- data structure
- data analysis
- high quality
- real time
- bloom filter
- data sources
- data mining techniques
- data quality
- query language
- probability distribution
- data points
- knn
- nearest neighbor
- query processing
- knowledge base
- computer systems
- statistical analysis
- high dimensional data
- synthetic data
- multimedia data
- databases