Norm of Word Embedding Encodes Information Gain.
Momose Oyama, Sho Yokoi, Hidetoshi Shimodaira
Published in: EMNLP (2023)
Keyphrases
- information gain
- document frequency
- stop words
- feature selection
- text categorization
- decision trees
- mutual information
- chi square
- chi squared
- co-occurrence
- naive bayes
- n-gram
- vector space
- occurrence frequency
- keywords
- gini index
- correlation based feature selection
- data sets
- gain ratio
- term frequency
- feature extraction
- data mining