SNU_IDS at SemEval-2019 Task 3: Addressing Training-Test Class Distribution Mismatch in Conversational Classification.
Sanghwan Bae, Jihun Choi, Sang-goo Lee
Published in: CoRR (2019)
Keyphrases
- class distribution
- training set
- class imbalance
- training samples
- ROC analysis
- training set size
- imbalanced data sets
- highly skewed
- supervised learning
- classification accuracy
- training process
- imbalanced datasets
- cost sensitive learning
- training examples
- test set
- cost sensitive
- decision boundary
- majority class
- imbalanced data
- training data
- feature selection
- test data
- learning algorithm
- class labels
- image classification
- minority class
- highly imbalanced
- feature extraction
- nearest neighbor
- high dimensionality
- text classification
- training dataset
- error rate
- machine learning methods
- unlabeled data
- high dimensional data
- active learning
- support vector
- decision trees
- data sets