Detecting and Mitigating Sampling Bias in Cybersecurity with Unlabeled Data.
Saravanan Thirumuruganathan, Fatih Deniz, Issa Khalil, Ting Yu, Mohamed Nabeel, Mourad Ouzzani
Published in: USENIX Security Symposium (2024)
Keyphrases
- unlabeled data
- labeled data
- semi supervised learning
- semi supervised
- active learning
- sample selection bias
- co training
- semi supervised classification
- supervised learning
- training set
- data points
- labeled training data
- text classification
- labeled examples
- text categorization
- learning algorithm
- labeled and unlabeled data
- training data
- domain adaptation
- class labels
- random sampling
- small set of labeled
- number of labeled examples
- label propagation
- supervised learning algorithms
- positive examples
- unsupervised learning
- prior knowledge
- semisupervised learning
- machine learning
- class distribution
- training examples
- data sets
- multi view learning
- semi supervised learning algorithms
- labeled instances
- target domain
- labeled data for training