Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection.
Lannan Luo, Jiang Ming, Dinghao Wu, Peng Liu, Sencun Zhu
Published in: SIGSOFT FSE (2014)
Keyphrases
- plagiarism detection
- binary codes
- source code
- hamming distance
- similarity search
- similarity measure
- databases
- duplicate detection
- cross language
- open source
- image collections
- machine learning
- hash functions
- distance measure
- edit distance
- high dimensional data
- high dimensional
- feature space
- keywords
- high level
- data mining