Document representation combining concepts and words in Chinese text categorization.
Chao Che, HongFei Teng
Published in: NLPKE (2009)
Keyphrases
- document representation
- text categorization
- text documents
- document frequency
- text classification
- bag of words
- document categorization
- document clustering
- text representation
- term frequency
- feature selection
- language model
- text data
- document collections
- knn
- background knowledge
- web documents
- vector space model
- vector space
- semantic information
- data fusion
- semi-supervised learning
- text mining
- n-gram
- tf-idf
- machine learning
- labeled data
- information extraction
- information retrieval
- k-nearest neighbor
- pairwise
- keywords
- training data