Unsupervised Documents Categorization Using New Threshold-Sensitive Weighting Technique.
Frédéric Ehrler, Patrick Ruch
Published in: AIME (2007)
Keyphrases
- automatic categorization
- text categorization
- document categorization
- automatic text categorization
- document collections
- term weighting
- text documents
- xml documents
- document classification
- relevant documents
- information retrieval systems
- topic modeling
- information retrieval
- legal documents
- metadata
- web documents
- weighting scheme
- word frequency
- document clustering
- document retrieval
- unsupervised learning
- multimedia documents
- database
- semi supervised
- knn
- user queries
- multi document summarization
- supervised learning
- term frequency
- tf idf
- vector space model
- keywords
- retrieval systems
- weighting schemes
- training data
- machine learning
- unsupervised manner
- document analysis
- text mining
- text classification
- free text
- ranked list
- semantic information
- query terms