Investigating Unsupervised Learning for Text Categorization Bootstrapping.
Alfio Massimiliano Gliozzo, Carlo Strapparava, Ido Dagan
Published in: HLT/EMNLP (2005)
Keyphrases
- text categorization
- unsupervised learning
- text classification
- semi-supervised learning
- feature selection
- supervised learning
- semi-supervised
- multi-label
- unlabeled data
- information gain
- information extraction
- automated text categorization
- naive bayes
- feature weighting
- text documents
- k-nearest neighbor
- tf-idf
- reuters corpus
- knn
- object recognition
- document categorization
- text classifiers
- term frequency
- labeled data
- term weighting
- automatic text categorization
- named entities
- dimensionality reduction
- machine learning
- expectation maximization
- text mining
- reinforcement learning
- transductive support vector machine
- training documents
- term selection
- n-gram
- decision trees
- information retrieval