Clustering tagged documents with labeled and unlabeled documents.
Chien-Liang Liu, Wen-Hoar Hsaio, Chia-Hoang Lee, Chun-Hsien Chen
Published in: Inf. Process. Manag. (2013)
Keyphrases
- unlabeled documents
- labeled documents
- text classification
- document classification
- training documents
- text classifiers
- text categorization
- semi supervised learning
- unlabeled data
- unsupervised learning
- clustering algorithm
- naive bayes
- labeled data
- clustering method
- text documents
- bayes classifier
- text data
- document clustering
- machine learning
- bag of words
- feature selection
- vector space
- supervised learning algorithms
- classification algorithm
- text mining
- knn
- class labels
- semi supervised
- supervised learning
- data points
- k means
- web documents
- k nearest neighbor
- data sets
- term frequency
- training data
- document set
- natural language processing
- accurate classifiers
- training corpus
- similarity search