SCTWC: An online semi-supervised clustering approach to topical web crawlers.
Huaxiang Zhang, Jing Lu
Published in: Appl. Soft Comput. (2010)
Keyphrases
- semi supervised clustering
- web crawlers
- semi supervised
- topic specific
- metric learning
- pairwise constraints
- web crawling
- unsupervised clustering
- online social
- semi supervised classification
- web crawler
- background knowledge
- search engine
- semi supervised learning
- web applications
- clustering algorithm
- information retrieval systems
- nearest neighbor
- web pages
- data sets
- web documents
- distance function
- unlabeled data
- nonnegative matrix factorization
- web sources
- probabilistic model
- prior knowledge
- decision trees