Clustering of highly homologous sequences to reduce the size of large protein databases.
Weizhong Li, Lukasz Jaroszewski, Adam Godzik
Published in: Bioinform. (2001)
Keyphrases
- clustering algorithm
- k means
- hidden markov models
- sequence alignment
- self organizing maps
- document clustering
- data clustering
- spectral clustering
- small size
- unsupervised learning
- clustering method
- graph theoretic
- maintenance cost
- database
- significantly reduced
- comparative analysis
- dissimilarity measure
- cluster analysis
- data points
- high dimensional
- pairwise