Re-embedding Difficult Samples via Mutual Information Constrained Semantically Oversampling for Imbalanced Text Classification.
Jiachen Tian, Shizhan Chen, Xiaowang Zhang, Zhiyong Feng, Deyi Xiong, Shaojuan Wu, Chunliu Dou
Published in: EMNLP (1) (2021)
Keyphrases
- mutual information
- text classification
- feature selection
- class imbalance
- minority class
- information theoretic
- image registration
- similarity measure
- bag of words
- vector space
- semantic features
- information gain
- n-gram
- text mining
- machine learning
- data cleaning
- class distribution
- imbalanced datasets
- data sets
- medical image registration
- semantic information
- naive bayes
- data mining
- neural network
- multi-label
- kNN
- natural language