Clustering of highly homologous sequences to reduce the size of large protein databases.

Weizhong Li, Lukasz Jaroszewski, Adam Godzik

Published in: Bioinformatics (2001)

Keyphrases

clustering algorithm
k-means
hidden Markov models
sequence alignment
self-organizing maps
document clustering
data clustering
spectral clustering
small size
unsupervised learning
clustering method
graph-theoretic
maintenance cost
database
significantly reduced
comparative analysis
dissimilarity measure
cluster analysis
data points
high-dimensional
pairwise