The hypergeometric test performs comparably to TF-IDF on standard text analysis tasks.
Paul Sheridan, Mikael Onsjö
Published in: Multim. Tools Appl. (2024)
Keyphrases
- text analysis
- tf idf
- text documents
- performs comparably
- text mining
- information retrieval
- information extraction
- text categorization
- text classification
- vector space model
- text collections
- document clustering
- wordnet
- natural language processing
- topic models
- retrieval model
- machine learning
- named entities
- bag of words
- keywords
- databases
- bayesian networks
- feature extraction
- image classification
- language model
- image retrieval
- training set
- expert systems
- multimedia
- knowledge base