UQlust: combining profile hashing with linear-time ranking for efficient clustering and analysis of big macromolecular data.
Rafal Adamczak, Jarek Meller
Published in: BMC Bioinform. (2016)
Keyphrases
- nearest neighbor
- high dimensional data
- data points
- nearest neighbor search
- data analysis
- original data
- data sets
- data collection
- statistical analysis
- big data
- data distribution
- spectral clustering
- data processing
- data structure
- knowledge discovery
- training data
- data objects
- raw data
- missing data
- synthetic data
- pairwise
- clustering method
- database
- data mining techniques
- data management
- data sources
- end users
- correlation analysis