Classification of RSS-Formatted Documents Using Full Text Similarity Measures.
Katarzyna Wegrzyn-Wolska, Piotr S. Szczepaniak
Published in: ICWE (2005)
Keyphrases
- document classification
- similarity measure
- information retrieval systems
- feature vectors
- pattern recognition
- machine learning
- web documents
- automatic categorization
- feature extraction
- document categorization
- classification algorithm
- information retrieval
- pre-classified
- journal articles
- retrieval systems
- supervised learning
- classification accuracy
- feature selection
- feature space
- digital libraries
- decision trees
- support vector machine (SVM)
- classification method
- text categorization
- support vector
- text documents
- text classifiers
- learning algorithm
- training data
- cosine similarity
- automatic classification
- keywords
- document clustering
- document retrieval
- context aware
- image classification
- training set