The Similarities of Text Documents.
C. van Noortwijk, Richard V. de Mulder
Published in: J. Inf. Law Technol. (1997)
Keyphrases
- text documents
- text mining
- text classification
- text categorization
- news articles
- WordNet
- topic models
- document classification
- keywords
- information extraction
- document clustering
- tf-idf
- bag of words
- text data
- textual information
- named entities
- automatic text categorization
- similarity measure
- text collections
- feature vectors
- text corpus
- training set
- multiscale
- decision trees
- learning algorithm
- machine learning