Text Categorization of Multilingual Web Pages in Specific Domain.
Jicheng Liu, Chunyan Liang
Published in: PAKDD (2008)
Keyphrases
- text categorization
- web pages
- cross domain
- cross language
- text classification
- multi label
- k nearest neighbor
- knn
- information gain
- feature selection
- website
- transfer learning
- reuters corpus
- naive bayes
- semi supervised learning
- automated text categorization
- web search
- text classifiers
- search engine
- text documents
- automatic text categorization
- tf idf
- term frequency
- feature selection for text categorization
- unlabeled data
- transductive support vector machine
- feature selections
- term weighting
- natural language
- information extraction
- cross lingual
- web documents