An efficient classification approach in imbalanced datasets for intrinsic plagiarism detection.
Andrianna Polydouri, Eleni Vathi, Georgios Siolas, Andreas Stafylopatis
Published in: Evol. Syst. (2020)
Keyphrases
- imbalanced datasets
- plagiarism detection
- class imbalance
- cost sensitive learning
- class distribution
- imbalanced data
- support vector machine
- decision trees
- classification accuracy
- training set
- source code
- benchmark datasets
- sampling methods
- ensemble methods
- machine learning methods
- classification algorithm
- feature selection
- svm classifier
- support vector machine svm
- active learning
- support vector
- data sets
- image classification
- feature set
- training dataset
- nearest neighbour
- feature vectors
- training data
- tree kernels
- machine learning