A simple probabilistic explanation of term frequency-inverse document frequency (tf-idf) heuristic (and variations motivated by this explanation).
Lukás Havrlant, Vladik Kreinovich
Published in: Int. J. Gen. Syst. (2017)
Keyphrases
- tf idf
- term frequency inverse document frequency
- keywords
- information retrieval
- text categorization
- vector space model
- text documents
- document clustering
- retrieval model
- term weighting
- term frequency
- ranking algorithm
- data mining
- machine learning
- probabilistic model
- information retrieval systems
- vector space
- clustering method
- web search
- high level
- search engine