First byte: Force-based clustering of filtered block N-grams to detect code reuse in malicious software.
Jason Upchurch, Xiaobo Zhou
Published in: MALWARE (2013)
Keyphrases
- n gram
- code reuse
- source code
- language model
- software evolution
- software systems
- bag of words
- language independent
- clustering algorithm
- text classification
- software development
- variable length
- clustering method
- k means
- inside outside algorithm
- document clustering
- anomaly detection
- software engineering
- software architecture
- web documents
- software projects
- software maintenance
- image quality
- part of speech
- character n grams
- machine learning
- neural network